Loading libraries
## Loading required package: Hmisc
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
##
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
##
## format.pval, units
## funModeling v.1.9.4 :)
## Examples and tutorials at livebook.datascienceheroes.com
## / Now in Spanish: librovivodecienciadedatos.ai
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ tibble 3.1.0 ✓ dplyr 1.0.5
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ✓ purrr 0.3.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::between() masks data.table::between()
## x dplyr::filter() masks stats::filter()
## x dplyr::first() masks data.table::first()
## x dplyr::lag() masks stats::lag()
## x dplyr::last() masks data.table::last()
## x dplyr::src() masks Hmisc::src()
## x dplyr::summarize() masks Hmisc::summarize()
## x purrr::transpose() masks data.table::transpose()
##
## Attaching package: 'bnlearn'
## The following object is masked from 'package:Hmisc':
##
## impute
Loading data
We chose the dataset “Used Cars Dataset from Craigslist.org”, available at: https://www.kaggle.com/austinreese/craigslist-carstrucks-data.
data <- read.csv(file = 'data/vehicles.csv', header = TRUE, na.strings=c(""," ","NA","0"))
data = subset(data, select = -1)
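The na.strings argument in the read.csv call above decides which raw field values are read as missing. As a minimal sketch, assuming a hypothetical three-row CSV (the names csv_text and toy and the values are made up for illustration and are not taken from vehicles.csv), the effect can be checked like this:

```r
# Hypothetical inline CSV: one price of 0 and one empty model field.
csv_text <- "id,price,model\n1,0,sonata\n2,7500,\n3,4900,x3\n"

# Same na.strings as in the report: empty string, single space, "NA" and "0"
# are all read as NA. Matching is done on the whole field, so 7500 is untouched.
toy <- read.csv(text = csv_text, header = TRUE,
                na.strings = c("", " ", "NA", "0"))

# TRUE marks the cells that were turned into NA.
print(is.na(toy))
```

Because "0" is in the list, a zero in any column becomes NA, not only a zero price.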
In this section we go through the dataset and describe it. We look more closely at what values the attributes take and what they describe.
c('number of columns:', ncol(data))
## [1] "number of columns:" "25"
c('number of rows',nrow(data))
## [1] "number of rows" "458213"
We have just under 460 thousand records available.
names(data)
## [1] "id" "url" "region" "region_url" "price"
## [6] "year" "manufacturer" "model" "condition" "cylinders"
## [11] "fuel" "odometer" "title_status" "transmission" "VIN"
## [16] "drive" "size" "type" "paint_color" "image_url"
## [21] "description" "state" "lat" "long" "posting_date"
There are 25 attributes in total.
10-digit numeric value
url - the full path from which the vehicle listing was downloaded - url/string
string
url - the path to the region category on the listing portal - url/string
numeric value
numeric value
string
string
string
numeric value
string
string
string
string
string
url - the path to the image of the vehicle - url/string
Below is a small sample of the data and a closer specification of it.
head(data,5)
## id
## 1 7240372487
## 2 7240309422
## 3 7240224296
## 4 7240103965
## 5 7239983776
## url
## 1 https://auburn.craigslist.org/ctd/d/auburn-university-2010-chevy-chevrolet/7240372487.html
## 2 https://auburn.craigslist.org/cto/d/auburn-2014-hyundai-sonata-20t/7240309422.html
## 3 https://auburn.craigslist.org/cto/d/auburn-2006-bmw-x3/7240224296.html
## 4 https://auburn.craigslist.org/cto/d/lanett-truck/7240103965.html
## 5 https://auburn.craigslist.org/cto/d/auburn-2005-ford-f350-lariat/7239983776.html
## region region_url price year manufacturer
## 1 auburn https://auburn.craigslist.org 35990 2010 chevrolet
## 2 auburn https://auburn.craigslist.org 7500 2014 hyundai
## 3 auburn https://auburn.craigslist.org 4900 2006 bmw
## 4 auburn https://auburn.craigslist.org 2000 1974 chevrolet
## 5 auburn https://auburn.craigslist.org 19500 2005 ford
## model condition cylinders fuel odometer title_status
## 1 corvette grand sport good 8 cylinders gas 32742 clean
## 2 sonata excellent 4 cylinders gas 93600 clean
## 3 x3 3.0i good 6 cylinders gas 87046 clean
## 4 c-10 good 4 cylinders gas 190000 clean
## 5 f350 lariat excellent 8 cylinders diesel 116000 lien
## transmission VIN drive size type paint_color
## 1 other 1G1YU3DW1A5106980 rwd <NA> other <NA>
## 2 automatic 5NPEC4AB0EH813529 fwd <NA> sedan <NA>
## 3 automatic <NA> <NA> <NA> SUV blue
## 4 automatic <NA> rwd full-size pickup blue
## 5 automatic <NA> 4wd full-size pickup blue
## image_url
## 1 https://images.craigslist.org/00N0N_ipkbHVZYf4w_0gw0co_600x450.jpg
## 2 https://images.craigslist.org/00s0s_gBHYmJ5o7yM_0ne0hq_600x450.jpg
## 3 https://images.craigslist.org/00B0B_5zgEGWPOrt0_07L0ak_600x450.jpg
## 4 https://images.craigslist.org/00M0M_6o7KcDpArwl_0CI0t2_600x450.jpg
## 5 https://images.craigslist.org/00p0p_b95l1EgUfly_0CI0t2_600x450.jpg
## description
## 1 Carvana is the safer way to buy a car During these uncertain times, Carvana is dedicated to ensuring safety for all of our customers. In addition to our 100% online shopping and selling experience that allows all customers to buy and trade their cars without ever leaving the safety of their house, we’re providing touchless delivery that make all aspects of our process even safer. Now, you can get the car you want, and trade in your old one, while avoiding person-to-person contact with our friendly advocates. There are some things that can’t be put off. And if buying a car is one of them, know that we’re doing everything we can to keep you keep moving while continuing to put your health safety, and happiness first. Vehicle Stock# 2000721559📱 Want to instantly check this car’s availability? Call us at 334-758-9176Just text that stock number to 855-976-4304 or head to http://www.carvanaauto.com/6143424-74502 and plug it into the search bar!Get PRE-QUALIFIED for your auto loan in 2 minutes - no hit to your credit:http://finance.carvanaauto.com/6143424-74502Looking for more cars like this one? We have 94 Chevrolet Corvette in stock for as low as $27990!Why buy with Carvana? We have one standard: the highest. Take a look at just some of the qualifications all of our cars must meet before we list them.150-POINT INSPECTION: We put each vehicle through a 150-point inspection so that you can be 100% confident in its quality and safety. See everything that goes into our inspections at:http://www.carvanaauto.com/6143424-74502NO REPORTED ACCIDENTS: We do not sell cars that have been in a reported accident or have a frame or structural damage.7 DAY TEST OWN MONEY BACK GUARANTEE: Every Carvana car comes with a 7-day money-back guarantee. Why? It takes more than 15-minutes to make a decision on your next car. Learn more about test owning at http://about.carvanaauto.comFLEXIBLE FINANCING, TRADE INS WELCOME: We’re all about real-time financing without the middle man. 
Need financing? Pick a combination of down and monthly payments that work for you. Have a trade-in? We’ll give you a value in 2 minutes. Check out everything about our financing at:http://finance.carvanaauto.com/6143424-74502COST SAVINGS: Carvana's business model has fewer expenses and no bloated fees compared to your local dealership. See how much we can save you at http://about.carvanaauto.comPREMIUM DETAIL: We go the extra mile so that your car is looking as good as new. There are a lot of specifics that we won’t list here (we wash, clean, buff, paint, polish, wax, seal), but trust us that when your car arrives, it’s going to look sweet.Vehicle Info for Stock# 2000721559Trim: Grand Sport Convertible 2D ConvertibleMileage: 32k milesExterior Color: GrayInterior Color: BlackEngine: 6.2L V8 430hp 424ft. lbs.Drive: rwdTransmission: Automatic, 6-Spd w/Paddle ShiftVIN: 1G1YU3DW1A5106980Dealer Disclosure: Price excludes tax, title, and registration (which we handle for you).Disclaimer: You agree that by providing your phone number, Carvana, or Carvana’s authorized representatives*, may call and/or send text messages (including by using equipment to automatically dial telephone numbers) about your interest in a purchase, for marketing/sales purposes, or for any other servicing or informational purpose related to your account. You do not have to consent to receiving calls or texts to purchase from Carvana. While every reasonable effort is made to ensure the accuracy of the information for this Chevrolet Corvette, we are not responsible for any errors or omissions contained in this ad. 
Please verify any information in question with Carvana at 334-758-9176*Including, but not limited to, Bridgecrest Credit Company, GO Financial and SilverRock Automotive.*Chevrolet* *Corvette* *Chevy* *Chevrolet* *Corvette* *vZR1* *Chevrolet* *Corvette* *Z06* *Hardtop* *Chevrolet* *Corvette* *Stingray* *Chevrolet* *Corvette* *3* *Lt* *Chevrolet* *Corvette* *C5-R* *Chevrolet* *Corvette* *Grand* *Sport* *Chevrolet* *Corvette* *Corvette* *C6* *ZR1* *Chevrolet* *Corvette* *2LT* *Chevrolet* *Corvette* *4LT* *Sports* *Car* *Coupe* 2021 2020 2019 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004 2003 2002 2001 2000 21 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
## 2 I'll move to another city and try to sell my car. The car is in very good condition, everything works and fully cleaned. It equipped with a heated seat, power seat, backup camera, Bluetooth, keyless entry and start. If you are interested in my car, please email me.
## 3 Clean 2006 BMW X3 3.0I. Beautiful and rare Blue Water Metallic exterior and tan interior color combination. 5-speed automatic transmission, AWD, CD/AM/FM radio, cold A/C (just serviced along w/oil change), alloy wheels, split rear-seats, driver/passenger air bags, multi-function remote w/ keyless entry, electric windows/door locks, cruise control, lighted vanity mirrors and many other extras. Missing tow eye cover in rear (~$20 to replace), tires ~50% tread remaining and a few blemishes on the exterior (scuffs, scratches, normal wear & tear). Would make an excellent, safe run-around town or college vehicle. Title in hand, priced to sell!
## 4 1974 chev. truck (LONG BED) NEW starter front and back breaks
## 5 2005 Ford F350 Lariat (Bullet Proofed). This truck was bullet proofed early on and has been well maintained. Truck is equiped with a 6.0 liter turbo diesel. Currently has 116K miles. Everything on the truck works as it should, truck is in excellent condition. Truck is in all original condition (except for the Bullet Proof upgrades). Truck comes equipped with gooseneck hitch, 15,000 lbs bumper hitch, brake controller, and upfitter switches. Has 430 limited slip gears. Fully loaded interior with heated leather power seats and power sliding back glass. It is an excellent choice for hauling a 5th wheel camper or for anyone needing to haul heavy loads. If you are looking for a pre-emission controlled diesel, you will not find a better truck than this one. Price is firm, Call Mark at show contact info
## state lat long posting_date
## 1 al 32.59000 -85.48000 2020-12-02T08:11:30-0600
## 2 al 32.54750 -85.46820 2020-12-02T02:11:50-0600
## 3 al 32.61681 -85.46415 2020-12-01T19:50:41-0600
## 4 al 32.86160 -85.21610 2020-12-01T15:54:45-0600
## 5 al 32.54750 -85.46820 2020-12-01T12:53:56-0600
Let us look at the number of records in the dataset that contain at least one missing attribute value.
na_rows <- data[rowSums(is.na(data)) > 0,]
nrow(na_rows)
## [1] 418512
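The row count above says how many records have at least one missing value, but not which attributes are responsible. A minimal sketch of a per-column breakdown, assuming a small made-up data frame (toy and its values are hypothetical, chosen only to resemble the dataset):

```r
# Hypothetical data frame with missing values in two of three columns.
toy <- data.frame(
  price = c(35990, NA, 4900),
  size  = c(NA, NA, "full-size"),
  fuel  = c("gas", "gas", "diesel")
)

# Number of NA values in each column.
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Number of rows with at least one NA; equivalent to
# nrow(toy[rowSums(is.na(toy)) > 0, ]).
rows_with_na <- sum(!complete.cases(toy))
print(rows_with_na)
```

Applied to the full dataset, colSums(is.na(data)) would show which attributes contribute most to the incomplete rows.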
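We can see that this is a large share of the data (418,512 of 458,213 records). This is caused by the parameter na.strings=c(""," ","NA","0") used when loading the file, which turns empty fields, single spaces and zeros into NA. We can look at a small sample of the records with missing values: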
na_rows[1:5,]
## id
## 1 7240372487
## 2 7240309422
## 3 7240224296
## 4 7240103965
## 5 7239983776
## url
## 1 https://auburn.craigslist.org/ctd/d/auburn-university-2010-chevy-chevrolet/7240372487.html
## 2 https://auburn.craigslist.org/cto/d/auburn-2014-hyundai-sonata-20t/7240309422.html
## 3 https://auburn.craigslist.org/cto/d/auburn-2006-bmw-x3/7240224296.html
## 4 https://auburn.craigslist.org/cto/d/lanett-truck/7240103965.html
## 5 https://auburn.craigslist.org/cto/d/auburn-2005-ford-f350-lariat/7239983776.html
## region region_url price year manufacturer
## 1 auburn https://auburn.craigslist.org 35990 2010 chevrolet
## 2 auburn https://auburn.craigslist.org 7500 2014 hyundai
## 3 auburn https://auburn.craigslist.org 4900 2006 bmw
## 4 auburn https://auburn.craigslist.org 2000 1974 chevrolet
## 5 auburn https://auburn.craigslist.org 19500 2005 ford
## model condition cylinders fuel odometer title_status
## 1 corvette grand sport good 8 cylinders gas 32742 clean
## 2 sonata excellent 4 cylinders gas 93600 clean
## 3 x3 3.0i good 6 cylinders gas 87046 clean
## 4 c-10 good 4 cylinders gas 190000 clean
## 5 f350 lariat excellent 8 cylinders diesel 116000 lien
## transmission VIN drive size type paint_color
## 1 other 1G1YU3DW1A5106980 rwd <NA> other <NA>
## 2 automatic 5NPEC4AB0EH813529 fwd <NA> sedan <NA>
## 3 automatic <NA> <NA> <NA> SUV blue
## 4 automatic <NA> rwd full-size pickup blue
## 5 automatic <NA> 4wd full-size pickup blue
## image_url
## 1 https://images.craigslist.org/00N0N_ipkbHVZYf4w_0gw0co_600x450.jpg
## 2 https://images.craigslist.org/00s0s_gBHYmJ5o7yM_0ne0hq_600x450.jpg
## 3 https://images.craigslist.org/00B0B_5zgEGWPOrt0_07L0ak_600x450.jpg
## 4 https://images.craigslist.org/00M0M_6o7KcDpArwl_0CI0t2_600x450.jpg
## 5 https://images.craigslist.org/00p0p_b95l1EgUfly_0CI0t2_600x450.jpg
## description
## 1 Carvana is the safer way to buy a car During these uncertain times, Carvana is dedicated to ensuring safety for all of our customers. In addition to our 100% online shopping and selling experience that allows all customers to buy and trade their cars without ever leaving the safety of their house, we’re providing touchless delivery that make all aspects of our process even safer. Now, you can get the car you want, and trade in your old one, while avoiding person-to-person contact with our friendly advocates. There are some things that can’t be put off. And if buying a car is one of them, know that we’re doing everything we can to keep you keep moving while continuing to put your health safety, and happiness first. Vehicle Stock# 2000721559📱 Want to instantly check this car’s availability? Call us at 334-758-9176Just text that stock number to 855-976-4304 or head to http://www.carvanaauto.com/6143424-74502 and plug it into the search bar!Get PRE-QUALIFIED for your auto loan in 2 minutes - no hit to your credit:http://finance.carvanaauto.com/6143424-74502Looking for more cars like this one? We have 94 Chevrolet Corvette in stock for as low as $27990!Why buy with Carvana? We have one standard: the highest. Take a look at just some of the qualifications all of our cars must meet before we list them.150-POINT INSPECTION: We put each vehicle through a 150-point inspection so that you can be 100% confident in its quality and safety. See everything that goes into our inspections at:http://www.carvanaauto.com/6143424-74502NO REPORTED ACCIDENTS: We do not sell cars that have been in a reported accident or have a frame or structural damage.7 DAY TEST OWN MONEY BACK GUARANTEE: Every Carvana car comes with a 7-day money-back guarantee. Why? It takes more than 15-minutes to make a decision on your next car. Learn more about test owning at http://about.carvanaauto.comFLEXIBLE FINANCING, TRADE INS WELCOME: We’re all about real-time financing without the middle man. 
Need financing? Pick a combination of down and monthly payments that work for you. Have a trade-in? We’ll give you a value in 2 minutes. Check out everything about our financing at:http://finance.carvanaauto.com/6143424-74502COST SAVINGS: Carvana's business model has fewer expenses and no bloated fees compared to your local dealership. See how much we can save you at http://about.carvanaauto.comPREMIUM DETAIL: We go the extra mile so that your car is looking as good as new. There are a lot of specifics that we won’t list here (we wash, clean, buff, paint, polish, wax, seal), but trust us that when your car arrives, it’s going to look sweet.Vehicle Info for Stock# 2000721559Trim: Grand Sport Convertible 2D ConvertibleMileage: 32k milesExterior Color: GrayInterior Color: BlackEngine: 6.2L V8 430hp 424ft. lbs.Drive: rwdTransmission: Automatic, 6-Spd w/Paddle ShiftVIN: 1G1YU3DW1A5106980Dealer Disclosure: Price excludes tax, title, and registration (which we handle for you).Disclaimer: You agree that by providing your phone number, Carvana, or Carvana’s authorized representatives*, may call and/or send text messages (including by using equipment to automatically dial telephone numbers) about your interest in a purchase, for marketing/sales purposes, or for any other servicing or informational purpose related to your account. You do not have to consent to receiving calls or texts to purchase from Carvana. While every reasonable effort is made to ensure the accuracy of the information for this Chevrolet Corvette, we are not responsible for any errors or omissions contained in this ad. 
Please verify any information in question with Carvana at 334-758-9176*Including, but not limited to, Bridgecrest Credit Company, GO Financial and SilverRock Automotive.*Chevrolet* *Corvette* *Chevy* *Chevrolet* *Corvette* *vZR1* *Chevrolet* *Corvette* *Z06* *Hardtop* *Chevrolet* *Corvette* *Stingray* *Chevrolet* *Corvette* *3* *Lt* *Chevrolet* *Corvette* *C5-R* *Chevrolet* *Corvette* *Grand* *Sport* *Chevrolet* *Corvette* *Corvette* *C6* *ZR1* *Chevrolet* *Corvette* *2LT* *Chevrolet* *Corvette* *4LT* *Sports* *Car* *Coupe* 2021 2020 2019 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004 2003 2002 2001 2000 21 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
## 2 I'll move to another city and try to sell my car. The car is in very good condition, everything works and fully cleaned. It equipped with a heated seat, power seat, backup camera, Bluetooth, keyless entry and start. If you are interested in my car, please email me.
## 3 Clean 2006 BMW X3 3.0I. Beautiful and rare Blue Water Metallic exterior and tan interior color combination. 5-speed automatic transmission, AWD, CD/AM/FM radio, cold A/C (just serviced along w/oil change), alloy wheels, split rear-seats, driver/passenger air bags, multi-function remote w/ keyless entry, electric windows/door locks, cruise control, lighted vanity mirrors and many other extras. Missing tow eye cover in rear (~$20 to replace), tires ~50% tread remaining and a few blemishes on the exterior (scuffs, scratches, normal wear & tear). Would make an excellent, safe run-around town or college vehicle. Title in hand, priced to sell!
## 4 1974 chev. truck (LONG BED) NEW starter front and back breaks
## 5 2005 Ford F350 Lariat (Bullet Proofed). This truck was bullet proofed early on and has been well maintained. Truck is equiped with a 6.0 liter turbo diesel. Currently has 116K miles. Everything on the truck works as it should, truck is in excellent condition. Truck is in all original condition (except for the Bullet Proof upgrades). Truck comes equipped with gooseneck hitch, 15,000 lbs bumper hitch, brake controller, and upfitter switches. Has 430 limited slip gears. Fully loaded interior with heated leather power seats and power sliding back glass. It is an excellent choice for hauling a 5th wheel camper or for anyone needing to haul heavy loads. If you are looking for a pre-emission controlled diesel, you will not find a better truck than this one. Price is firm, Call Mark at show contact info
## state lat long posting_date
## 1 al 32.59000 -85.48000 2020-12-02T08:11:30-0600
## 2 al 32.54750 -85.46820 2020-12-02T02:11:50-0600
## 3 al 32.61681 -85.46415 2020-12-01T19:50:41-0600
## 4 al 32.86160 -85.21610 2020-12-01T15:54:45-0600
## 5 al 32.54750 -85.46820 2020-12-01T12:53:56-0600
summary(data)
## id url region region_url
## Min. :7208549803 Length:458213 Length:458213 Length:458213
## 1st Qu.:7231952523 Class :character Class :character Class :character
## Median :7236408504 Mode :character Mode :character Mode :character
## Mean :7235233427
## 3rd Qu.:7239320847
## Max. :7241019367
##
## price year manufacturer model
## Min. : 1 Min. :1900 Length:458213 Length:458213
## 1st Qu.: 5995 1st Qu.:2008 Class :character Class :character
## Median : 12394 Median :2013 Mode :character Mode :character
## Mean : 43635 Mean :2011
## 3rd Qu.: 22900 3rd Qu.:2016
## Max. :3615215112 Max. :2021
## NA's :33753 NA's :1050
## condition cylinders fuel odometer
## Length:458213 Length:458213 Length:458213 Min. : 0
## Class :character Class :character Class :character 1st Qu.: 40877
## Mode :character Mode :character Mode :character Median : 87641
## Mean : 101670
## 3rd Qu.: 134000
## Max. :2043755555
## NA's :55303
## title_status transmission VIN drive
## Length:458213 Length:458213 Length:458213 Length:458213
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## size type paint_color image_url
## Length:458213 Length:458213 Length:458213 Length:458213
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## description state lat long
## Length:458213 Length:458213 Min. :-82.61 Min. :-164.09
## Class :character Class :character 1st Qu.: 34.60 1st Qu.:-110.89
## Mode :character Mode :character Median : 39.24 Median : -88.31
## Mean : 38.53 Mean : -94.38
## 3rd Qu.: 42.48 3rd Qu.: -81.02
## Max. : 82.05 Max. : 150.90
## NA's :7448 NA's :7448
## posting_date
## Length:458213
## Class :character
## Mode :character
##
##
##
##
As the significant attributes that we will work with further, we selected:
We check whether the attribute contains any missing values:
sum(is.na(data$price))
## [1] 33753
We look at the outliers in the attribute:
boxplot(data$price, las=3)
We can see that the price attribute contains a few very high values. These are either luxury cars or “dirty” values.
data[order(-data$price),][1:5,c('manufacturer','model','year','price')]
## manufacturer model year price
## 385435 chevrolet silverado 2500hd 2006 3615215112
## 425189 jeep <NA> 2003 2857993261
## 38376 gmc <NA> 2020 2808348671
## 1623 chevrolet <NA> 1955 1234567890
## 306218 ram 3500 crewcab tradesma 2018 123456789
As we could see, these are no Ferraris or Lamborghinis; they are incorrectly filled-in values.
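The boxplot marks points beyond 1.5 times the interquartile range from the quartiles as outliers. As a rough sketch of the same rule applied by hand, assuming a short hypothetical price vector (the first five values are taken from the sample rows, the last from the top-prices listing), the suspicious values can be flagged as follows. This is only an illustration of the whisker rule, not necessarily the cleaning method used later.

```r
# Hypothetical price vector with one obviously wrong value.
prices <- c(2000, 4900, 7500, 19500, 35990, 1234567890)

# First and third quartile, ignoring missing values.
q <- quantile(prices, probs = c(0.25, 0.75), na.rm = TRUE)
iqr <- q[[2]] - q[[1]]

# Upper fence of the 1.5 * IQR rule.
upper_fence <- q[[2]] + 1.5 * iqr

# Values above the fence are candidate outliers.
outliers <- prices[!is.na(prices) & prices > upper_fence]
print(outliers)
```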
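nrow(data[is.null(data$price),])  # is.null() tests the column object as a whole, not each value, so this is 0 whenever the column exists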
## [1] 0
nrow(data[is.na(data$price),])
## [1] 33753
What is less pleasing is that, as we have seen, a fairly large number of values of this attribute (33,753) are zero (NA). We believe these are artificially lowered or unspecified prices, entered to improve the position of the listing in search results. That is why we replaced them (the zeros) with NA at the start, when loading the dataset. Our explanation is that this practice is used to get listings onto the first pages, because people commonly search from the lowest price up to their price ceiling.
Besides that, the attribute contains very high values, which we will deal with in the data cleaning phase.
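The zero prices were turned into NA globally through na.strings, which also affects zeros in every other column. A minimal sketch of a more targeted alternative, assuming a hypothetical data frame toy (this is not what the report does, only an illustration of limiting the replacement to one attribute):

```r
# Hypothetical data: zeros in price are placeholders, the zero in
# cylinders is kept as an ordinary value.
toy <- data.frame(
  price     = c(0, 7500, 0, 19500),
  cylinders = c(8, 0, 4, 6)
)

# Replace zeros with NA in the price column only.
toy$price[!is.na(toy$price) & toy$price == 0] <- NA

print(toy)
print(sum(is.na(toy$price)))
```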
boxplot(data$price, las=3)
ggplot(data = data, aes(sample=price)) +
stat_qq() +
stat_qq_line() +
scale_y_continuous(breaks = seq(0, 5000000, by = 50000))
## Warning: Removed 33753 rows containing non-finite values (stat_qq).
## Warning: Removed 33753 rows containing non-finite values (stat_qq_line).
Number of missing values:
sum(is.na(data$year))
## [1] 1050
As we can see, the year attribute has 1050 missing values.
One possible explanation is that rebuilt cars are common in America, so the seller may have left out the year because of such a rebuild, where the year plays no role. For example, the body of the vehicle is from 1970 and the installed technology is from 2018.
data[is.na(data$year),][1:5,]
## id
## 16 7236904120
## 384 7238204872
## 470 7237486859
## 485 7237299924
## 850 7235124536
## url
## 16 https://auburn.craigslist.org/ctd/d/royal-palm-beach-2019-ram-1500-big-horn/7236904120.html
## 384 https://bham.craigslist.org/ctd/d/new-castle-2019-nissan-sentra-cvt-gun/7238204872.html
## 470 https://bham.craigslist.org/ctd/d/vicksburg-2019-chevrolet-silverado/7237486859.html
## 485 https://bham.craigslist.org/ctd/d/new-castle-2018-jeep-compass-latitude/7237299924.html
## 850 https://bham.craigslist.org/ctd/d/new-castle-2018-toyota-highlander-xle/7235124536.html
## region region_url price year manufacturer
## 16 auburn https://auburn.craigslist.org 38500 NA <NA>
## 384 birmingham https://bham.craigslist.org 14500 NA <NA>
## 470 birmingham https://bham.craigslist.org 41800 NA <NA>
## 485 birmingham https://bham.craigslist.org 18700 NA <NA>
## 850 birmingham https://bham.craigslist.org 28900 NA <NA>
## model condition cylinders fuel odometer title_status
## 16 500 <NA> 8 cylinders gas 28246 clean
## 384 n Sentra <NA> 4 cylinders gas 22546 clean
## 470 olet Silverado 2500HD <NA> 8 cylinders diesel 80910 <NA>
## 485 Compass <NA> 4 cylinders gas 18316 clean
## 850 a Highlander <NA> 6 cylinders gas 63061 clean
## transmission VIN drive size type paint_color
## 16 automatic 1C6RREMT7KN655834 rwd <NA> pickup white
## 384 automatic 3N1AB7AP9KY380549 fwd <NA> sedan grey
## 470 automatic 1GC1KSEY9KF121232 4wd <NA> pickup white
## 485 automatic 3C4NJCBB4JT108564 fwd <NA> SUV red
## 850 automatic 5TDKZRFH9JS546309 fwd <NA> SUV white
## image_url
## 16 https://images.craigslist.org/00Y0Y_65ISqDroMwD_0kE0bC_600x450.jpg
## 384 https://images.craigslist.org/00d0d_4rIc0Iq9E49_0kE0dM_600x450.jpg
## 470 https://images.craigslist.org/00b0b_jsyRZLmAUwW_0kE0fu_600x450.jpg
## 485 https://images.craigslist.org/00q0q_8PVU7cjYgfw_0kE0dN_600x450.jpg
## 850 https://images.craigslist.org/00S0S_iAz66PJkfz0_0kE0dL_600x450.jpg
## description
## 16 2019 *Ram* *1500* Big Horn/Lone Star 4x2 Crew Cab 6'4" Box Truck - $38,500Call Us Today! 561-693-0621Text Us Today! 561-203-4849Ram_ 1500_ For Sale by Peterson Motorcars Call For The Best Deals Today! Vehicle Description For This *Ram* *1500*PETERSON MOTORCARS USED TRUCKS FOR SALE WEST PALM BEACH FL 33409Clean carfax, one owner, Florida truck, Big horn Sport, 6'4" box, 8.4" touch screen with apple carplay, bluetooth, usb, mp3, backup camera, power sliding rear window, power seats, keyless go, 5.7L v8, Sport appearance package, power adjustable pedals, level 1 equiptment group, power folding mirrors, 3.92 axle ratio, anti spin rear differential and so much more! Call 561 371 5504 or visit www.PetersonMotorcars.com for more information and photos! PETERSON MOTORCARS USED TRUCKS FOR SALE WEST PALM BEACH FL 33409 View additional pictures and details This Ram_ 1500_ at: http://www.petersonmotorcars.com/details-2019-ram-1500-big_horn_lone_star_4x2_crew_cab_6_4_box-used-1c6rremt7kn655834.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist Vehicle Details For This *Ram* *1500* Year: 2019 Make: Ram Model: 1500 Trim: Big Horn/Lone Star 4x2 Crew Cab 6'4" Box VIN: 1C6RREMT7KN655834 Stock#: X655834 Condition: Used Clear Title Miles: 28,246 Exterior Color: Bright White Clearcoat Interior Color: Black Engine: 5.7L 8 CYLINDER Transmission: 8 Spd Automatic Drivetrain: Rear Wheel Drive Ram Installed Options & Packages For This *Ram* *1500* ENGINE: 5.7L V8 HEMI MDS VVT EZH - Hemi Badge Dual Rear Exhaust w/Bright Tips 180 Amp Alternator Heavy Duty Engine Cooling Active Noise Control System TRANSMISSION: 8-SPEED AUTOMATIC (850RE) DFT TRANSMISSION: 8-SPEED AUTOMATIC (8HP75) DFR QUICK ORDER PACKAGE 24Z BIG HORN/LONE STAR 24Z - Engine: 5.7L V8 HEMI MDS VVT Transmission: 8-Speed Automatic (8HP75) Steering Wheel Mounted Audio Controls 3.92 REAR AXLE RATIO DMH BRIGHT WHITE CLEARCOAT PW7 SPORT APPEARANCE PACKAGE AEF - Body Color Door Handles Tires: 275/55R20 
OWL All Season Grille B/Color Outline 1 Texture 2 Body Color Rear Bumper w/Step Pads Exterior Mirrors Courtesy Lamps Black Interior Accents Auto Dim Exterior Driver Mirror Body Color Front Bumper Exterior Mirrors w/Supplemental Signals Exterior Mirrors w/Memory Power-Folding Mirrors Power Heated Fold-Away Mirrors BIG HORN LEVEL 1 EQUIPMENT GROUP A62 - Rear Window Defroster Cluster 3.5" TFT Color Display Power 8-Way Driver Seat Rear Power Sliding Window Sun Visors w/Illuminated Vanity Mirrors Glove Box Lamp Integrated Center Stack Radio Class IV Receiver Hitch Single Disc Remote CD Player Power 4-Way Driver Lumbar Adjust Power Adjustable Pedals Foam Bottle Insert (Door Trim Panel) Google Android Auto For More Info Call 800-643-2112 Exterior Mirrors Courtesy Lamps 1-Year SiriusXM Radio Service Auto Dim Exterior Driver Mirror Radio: Uconnect 4 w/8.4" Display SiriusXM Satellite Radio Exterior Mirrors w/Supplemental Signals Big Horn IP Badge Rear Dome w/On/Off Switch Lamp Universal Garage Door Opener Power Heated Fold Away Mirrors Rear View Auto Dim Mirror 8.4" Touchscreen Display Power-Folding Mirrors Apple CarPlay ANTI-SPIN DIFFERENTIAL REAR AXLE DSA Ram About Us Peterson Motorcars CORPORATE OFFICES 1844 Church St West Palm Beach, FL 33409 Call NOW to Reserve this Ram_ 1500_! 561-693-0621Text NOW to Reserve this Ram_ 1500_! 561-203-4849 *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *PETERSON MOTORCARS USED TRUCKS FOR SALE WEST PALM BEACH FL 33409* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *For Sale* *Clean* *Bright White Clearcoat* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *Cheap* *Like New* *Rear Wheel Drive* *5.7L 8 CYLINDER * *Used* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box* *Ram* *1500* *Big Horn/Lone Star 4x2 Crew Cab 6'4" Box*
## 384 2019 *Nissan* *Sentra* S CVT Sedan - $14,500Call or Text Us Today! 205-512-6569\t Nissan_ Sentra_ For Sale by World Class MotorsFor Financing - step 1 is to complete our short online application @ WorldClassApproval.com Vehicle Description For This *Nissan* *Sentra*We finance | 1-Owner, clean Carfax - like brand NEW 2019 Nissan Sentra S CVT sedan! Only 22K miles & still covered under factory new car warranty. Excellent daily driver that is reliable & gets 37-mpg! We finance. Low rates. Call or text 256-595-9403 for more info. Apply now @ WorldClassApproval.com. Trades welcome. Shipping available. 1920 Decatur Hwy, Gardendale, AL.View additional pictures and details This Nissan_ Sentra_ at: http://www.worldclassmotors.com/details-2019-nissan-sentra-s_cvt-used-3n1ab7ap9ky380549.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist Vehicle Details For This *Nissan* *Sentra* Year: 2019 Make: Nissan Model: Sentra Trim: S CVT VIN: 3N1AB7AP9KY380549 Stock#: 283833 Condition: Used Clear Title Miles: 22,546 Exterior Color: Gun Metallic Interior Color: Charcoal Engine: 1.8L 4 CYLINDER Transmission: CVT Drivetrain: Front Wheel Drive Nissan Features & Options For This *Nissan* *Sentra* Ext / Int Color Gun Metallic with Charcoal Cloth Interior Luxury Features Cruise Control Remote Trunk Lid Steering Wheel Radio Controls Telescoping Steering Wheel Tire Pressure Monitor Power Equipment Power Mirrors Power Steering Safety Features Child Proof Door Locks Driver's Air Bag Intermittent Wipers Passenger Air Bag Rear Defogger Roll Stability Control Side Air Bags Side Curtain Airbags Interior Center Arm Rest Clock Overhead Console Tachometer Vanity Mirrors Exterior Remote Fuel Door Sliding Rear Window Audio / Video AM/FM Bluetooth CD Player Reverse Camera Touch Screen Nissan About Us World Class Motors 1920 Decatur Highway Gardendale, AL 35071 Call or Text NOW to Reserve this Nissan_ Sentra_! 
205-512-6569\t *Nissan* *Sentra* *S CVT* *Nissan* *Sentra* *S CVT* *For Sale* *Clean* *Gun Metallic* *Nissan* *Sentra* *S CVT* *Cheap* *Like New* *Front Wheel Drive* *1.8L 4 CYLINDER* *Used* *Nissan* *Sentra* *S CVT* *Nissan* *Sentra* *S CVT* *Nissan* *Sentra* *S CVT*
## 470 2019 *Chevrolet* *Silverado 2500HD* CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD Truck - $41,800Call Us Today! 601-374-6258Chevrolet_ Silverado 2500HD_ For Sale by George Carr Buick GMC Vehicle Description For This *Chevrolet* *Silverado 2500HD*4X4 2500HD DURAMAX DIESELVERY VERY CLEAN, ONE OWNER. OFF LEASE. COMPLETELY SERVICED AND INSPECTED BY OUR GM CERTIFIED TECHNICIANS. B&W GOOSENECK HITCH. BUILT IN BRAKE CONTROLLER. NEW COOPER DISCOVER TIRES. READY TO GET DOWN TO WORK. FINANCING AVAILABLE (W.A.C.). View additional pictures and details This Chevrolet_ Silverado 2500HD_ at: http://www.georgecarrworktrucks.com/details-2019-chevrolet-silverado_2500hd-crew_cab_4x4_duramax_diesel_2500_2500hd-used-1gc1ksey9kf121232.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist Vehicle Details For This *Chevrolet* *Silverado 2500HD* Year: 2019 Make: Chevrolet Model: Silverado 2500HD Trim: CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD VIN: 1GC1KSEY9KF121232 Stock#: P15011 Condition: Used Clear Title Miles: 80,910 Exterior Color: Summit White Interior Color: Jet Black/Medium Ash Gray Piping and Stitching Engine: 6.6L 8 CYLINDER TURBOCHARGED Transmission: 6 Spd Automatic Drivetrain: Four Wheel Drive Chevrolet Installed Options & Packages For This *Chevrolet* *Silverado 2500HD* LT PREFERRED EQUIPMENT GROUP 1LT - Standard Equipment Chevrolet About Us George Carr Buick GMC Contact: ROBERT LANDRY 2950 S. Frontage Rd. Vicksburg, MS 39180 Call NOW to Reserve this Chevrolet_ Silverado 2500HD_! 
601-374-6258 *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *4X4 2500HD DURAMAX DIESEL* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *For Sale* *Clean* *Summit White* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *Cheap* *Like New* *Four Wheel Drive* *6.6L 8 CYLINDER TURBOCHARGED* *Used* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD* *Chevrolet* *Silverado 2500HD* *CREW CAB 4X4 DURAMAX DIESEL 2500 2500HD*
## 485 2018 *Jeep* *Compass* Latitude FWD SUV - $18,700Call or Text Us Today! 205-512-6569\t Jeep_ Compass_ For Sale by World Class MotorsFor Financing - step 1 is to complete our short online application @ WorldClassApproval.com Vehicle Description For This *Jeep* *Compass*Like brand NEW 1-Owner, clean Carfax 2018 Jeep Compass Latitude! Local new car trade-in. Loaded with navigation, backup camera, power driver seat, keyless alarm, premium audio, Sirius XM satellite radio, Bluetooth, music interphase & more. Serviced, inspected & comes with warranty. We finance. Low rates! Call or text 256-595-9403 for more info. Apply now @ WorldClassApproval.com. Trades welcome. Shipping available. 1920 Decatur Hwy, Gardendale, AL.View additional pictures and details This Jeep_ Compass_ at: http://www.worldclassmotors.com/details-2018-jeep-compass-latitude_fwd-used-3c4njcbb4jt108564.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist Vehicle Details For This *Jeep* *Compass* Year: 2018 Make: Jeep Model: Compass Trim: Latitude FWD VIN: 3C4NJCBB4JT108564 Stock#: 283886 Condition: Used Clear Title Miles: 18,316 Exterior Color: Redline Pearlcoat Interior Color: Black Engine: 2.4L 4 CYLINDER Transmission: 6 Spd Automatic Drivetrain: Front Wheel Drive Jeep Installed Options & Packages For This *Jeep* *Compass* ENGINE: 2.4L I4 PZEV M-AIR W/ESS EDE TRANSMISSION: 6-SPEED AISIN F21-250 GEN 3 AUTO DF7 QUICK ORDER PACKAGE 28J 28J - Engine: 2.4L I4 PZEV M-Air w/ESS Transmission: 6-Speed Aisin F21-250 Gen 3 Auto REDLINE PEARLCOAT PRM NAVIGATION GROUP AMA - USB Host Flip Google Android Auto Premium Air Filter For More Info Call 800-643-2112 Radio: Uconnect 4C Nav w/8.4" Display Integrated Center Stack Radio 1-Year SiriusXM Radio Service SiriusXM Satellite Radio GPS Antenna Input Air Conditioning ATC w/Dual Zone Control Apple CarPlay Humidity Sensor POPULAR EQUIPMENT GROUP ANF - Cluster 7.0" Color Driver Info Display 115V Auxiliary Power Outlet 7.0" Touch Screen 
Display Remote Start System Rear View Auto Dim Mirror Power 8-Way Driver/Manual 6-Way Passenger Seats 4-Way Power Lumbar Adjust POWER 8-WAY DRIVER/MANUAL 6-WAY PASSENGER SEATS JPR - 4-Way Power Lumbar Adjust Jeep About Us World Class Motors 1920 Decatur Highway Gardendale, AL 35071 Call or Text NOW to Reserve this Jeep_ Compass_! 205-512-6569\t *Jeep* *Compass* *Latitude FWD* *Jeep* *Compass* *Latitude FWD* *For Sale* *Clean* *Redline Pearlcoat* *Jeep* *Compass* *Latitude FWD* *Cheap* *Like New* *Front Wheel Drive* *2.4L 4 CYLINDER* *Used* *Jeep* *Compass* *Latitude FWD* *Jeep* *Compass* *Latitude FWD* *Jeep* *Compass* *Latitude FWD*
## 850 2018 *Toyota* *Highlander* XLE V6 FWD SUV - $28,900Call or Text Us Today! 205-512-6569\t Toyota_ Highlander_ For Sale by World Class MotorsFor Financing - step 1 is to complete our short online application @ WorldClassApproval.com Vehicle Description For This *Toyota* *Highlander*1-Owner, clean Carfax - like brand NEW 2018 Toyota Highlander V6 XLE finished in pearl white over black premium leather interior! Features include a 3.5L V6 engine, full power heated leather seats, 3rd row, navigation, backup camera, blind spot assist, rear climate control, power sunroof, power trunk, premium audio, Sirius XM satellite radio, Bluetooth, music interphase, premium alloys, rear bucket seats & more! Fully serviced, inspected & comes with warranty. We finance. Low competitive rates! Call or text 256-595-9403 for more info. Apply now @ WorldClassApproval.com. Trades welcome. Shipping available. 1920 Decatur Hwy, Gardendale, AL.View additional pictures and details This Toyota_ Highlander_ at: http://www.worldclassmotors.com/details-2018-toyota-highlander-xle_v6_fwd-used-5tdkzrfh9js546309.html?utm_source=craigslist&utm_medium=referral&utm_campaign=ebizautos_craigslist Vehicle Details For This *Toyota* *Highlander* Year: 2018 Make: Toyota Model: Highlander Trim: XLE V6 FWD VIN: 5TDKZRFH9JS546309 Stock#: 283875 Condition: Used Clear Title Miles: 63,061 Exterior Color: Blizzard Pearl Interior Color: Black Engine: 3.5L V6 CYLINDER Transmission: 8 Spd Automatic Drivetrain: Front Wheel Drive Toyota Installed Options & Packages For This *Toyota* *Highlander* BLIZZARD PEARL 070 SECURITY SYSTEM V5 Toyota About Us World Class Motors 1920 Decatur Highway Gardendale, AL 35071 Call or Text NOW to Reserve this Toyota_ Highlander_! 
205-512-6569\t *Toyota* *Highlander* *XLE V6 FWD* *Toyota* *Highlander* *XLE V6 FWD* *For Sale* *Clean* *Blizzard Pearl* *Toyota* *Highlander* *XLE V6 FWD* *Cheap* *Like New* *Front Wheel Drive* *3.5L V6 CYLINDER* *Used* *Toyota* *Highlander* *XLE V6 FWD* *Toyota* *Highlander* *XLE V6 FWD* *Toyota* *Highlander* *XLE V6 FWD*
## state lat long posting_date
## 16 al 26.70385 -80.08200 2020-11-25T11:52:03-0600
## 384 al 33.66960 -86.81762 2020-11-28T10:01:51-0600
## 470 al 32.33205 -90.85716 2020-11-26T19:10:40-0600
## 485 al 33.66960 -86.81762 2020-11-26T10:12:22-0600
## 850 al 33.66960 -86.81762 2020-11-22T04:40:36-0600
As we can see, the description attribute often also states the year of the vehicle, so we could consider extracting the year from this attribute to fill in the missing values.
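The idea above could be sketched as follows. This is only an illustration under stated assumptions: the helper `extract_year` and the toy `listings` data frame are hypothetical, and the first standalone four-digit number between 1900 and 2021 is taken to be the model year. Street numbers such as "1920 Decatur Hwy" match the same pattern, so real data would need extra checks.

```r
# Hypothetical helper (not part of the original analysis):
# return the first standalone number in 1900..2021 found in each text, else NA.
extract_year <- function(text) {
  pattern <- "\\b(19[0-9]{2}|20[01][0-9]|202[01])\\b"
  start <- regexpr(pattern, text, perl = TRUE)
  len <- attr(start, "match.length")
  found <- !is.na(start) & start > 0
  year <- rep(NA_integer_, length(text))
  year[found] <- as.integer(
    substring(text[found], start[found], start[found] + len[found] - 1)
  )
  year
}

# Toy rows; the description texts are shortened from listings shown in this section.
listings <- data.frame(
  year = c(NA, 2005, NA),
  description = c(
    "2019 *Nissan* *Sentra* S CVT Sedan - $14,500",
    "2005 Range Rover HSE, 4X4 , V8 , only 102k miles",
    "Great truck everything works 4x4 works great hwy miles"
  ),
  stringsAsFactors = FALSE
)

# Fill only the rows where year is missing.
missing_year <- is.na(listings$year)
listings$year[missing_year] <- extract_year(listings$description[missing_year])
print(listings$year)
```

Rows whose description contains no plausible year simply stay `NA`, so the step never overwrites a year that the seller did fill in.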
ggplot(data = data[!is.na(data$year),], aes(x=year)) +
geom_histogram(bins = 121, fill= 6, color="#ffffff") +
xlab("Year") +
ylab("Frequency") +
scale_y_continuous(breaks = seq(0, 500000, by = 20000)) +
scale_x_continuous(breaks = seq(1900, 2021, by = 5)) +
theme(axis.text.x = element_text(angle = 90))
We can see that the year values do not follow a normal distribution. Older vehicles (before 1990) and the very newest ones are scarce, while, as expected, vehicles from the last three decades are the most common. Even older vehicles do appear, but those are mostly advertised on specialized forums, for example ones for classic cars.
This distribution was to be expected, since the average vehicle age is 12 years.
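The 12-year figure is quoted without a source, so it may be worth comparing it with the listings themselves. A minimal sketch, assuming the age of a listed vehicle is the posting year minus the model year; the toy `cars` data frame is hypothetical and only mirrors the `year` and `posting_date` columns.

```r
# Toy rows mirroring the `year` and `posting_date` columns of the dataset.
cars <- data.frame(
  year = c(2019, 2018, 2005, NA),
  posting_date = c(
    "2020-11-28T10:01:51-0600",
    "2020-11-26T10:12:22-0600",
    "2020-11-28T09:07:31-0600",
    "2020-12-01T07:50:35-0600"
  ),
  stringsAsFactors = FALSE
)

# The posting year is the first four characters of the ISO-like timestamp.
posting_year <- as.integer(substr(cars$posting_date, 1, 4))

# Age at posting time; rows without a model year stay NA.
cars$age <- posting_year - cars$year

# Mean age over the rows where it is known.
mean_age <- mean(cars$age, na.rm = TRUE)
print(mean_age)
```

Run against the full `data` frame, the same three lines would give the mean age of the advertised vehicles, which need not equal the average age of all vehicles on the road.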
Number of missing values.
nrow(data[is.na(data$manufacturer),])
## [1] 18220
The attribute has about 18 thousand missing manufacturer values; either the sellers left the field blank, or the data was lost during export.
Let's build a bar chart of the manufacturer frequencies:
group_manu <- data %>%
group_by(manufacturer) %>%
summarize(frequency = n())
group_manu[order(-group_manu$frequency),][1:10,]
## # A tibble: 10 x 2
## manufacturer frequency
## <chr> <int>
## 1 ford 79666
## 2 chevrolet 64977
## 3 toyota 38577
## 4 honda 25868
## 5 nissan 23654
## 6 jeep 21165
## 7 <NA> 18220
## 8 ram 17697
## 9 gmc 17267
## 10 dodge 16730
America did not disappoint: the ranking is topped by a domestic manufacturer, Ford.
ggplot(data = group_manu, aes(x = manufacturer, y = frequency)) +
geom_bar( stat = "identity", fill= 2, color="#ffffff") +
ylab("Frequency") +
xlab("Manufacturer") +
theme(axis.text.x = element_text(angle = 90)) +
scale_y_continuous(breaks = seq(0, 100000, by = 5000))
Number of missing values:
nrow(data[is.na(data$model),])
## [1] 4847
About 5 thousand records have no value in the model column. The following table shows, however, that some of them do mention the model name in the listing text (description).
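One way to exploit this is to search each description for model names that already occur elsewhere in the dataset. The sketch below is only an assumption about how that could look: `guess_model` and the short `known_models` vector are hypothetical, and the last description is made-up toy text. Very short model names (for example "1500") would produce false positives and should be excluded in practice.

```r
# Hypothetical helper: return the longest known model name that occurs in a
# description (case-insensitive, literal match), or NA when none does.
guess_model <- function(description, known_models) {
  vapply(description, function(d) {
    if (is.na(d)) return(NA_character_)
    hits <- known_models[vapply(
      known_models,
      function(m) grepl(m, tolower(d), fixed = TRUE),
      logical(1)
    )]
    if (length(hits) == 0) NA_character_ else hits[which.max(nchar(hits))]
  }, character(1), USE.NAMES = FALSE)
}

# Hypothetical lower-case lookup list of model names.
known_models <- c("silverado", "silverado 1500", "camry", "range rover")

descriptions <- c(
  "2005 Range Rover HSE, 4X4 , V8 , only 102k miles",
  "Great truck everything works",
  NA,
  "2015 Chevy Silverado 1500 crew cab"
)

guessed <- guess_model(descriptions, known_models)
print(guessed)
```

Preferring the longest hit keeps "silverado 1500" from being reported as plain "silverado".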
data[is.na(data$model),][1:5,]
## id
## 43 7232651921
## 206 7239785162
## 212 7239719006
## 368 7238346972
## 388 7238172849
## url
## 43 https://auburn.craigslist.org/ctd/d/ton-service-utility-trucks-ford-chevy/7232651921.html
## 206 https://bham.craigslist.org/cto/d/gainesville-2017-ram-4x4-for-sale/7239785162.html
## 212 https://bham.craigslist.org/cto/d/helena-2011-range-rover/7239719006.html
## 368 https://bham.craigslist.org/cto/d/kellyton-2007-mazda/7238346972.html
## 388 https://bham.craigslist.org/cto/d/birmingham-2005-range-rover/7238172849.html
## region region_url price year manufacturer model
## 43 auburn https://auburn.craigslist.org NA 2014 ram <NA>
## 206 birmingham https://bham.craigslist.org 18000 2017 ram <NA>
## 212 birmingham https://bham.craigslist.org 22000 2011 rover <NA>
## 368 birmingham https://bham.craigslist.org 3400 2006 mazda <NA>
## 388 birmingham https://bham.craigslist.org 6900 2005 rover <NA>
## condition cylinders fuel odometer title_status transmission VIN drive
## 43 <NA> <NA> diesel 0 clean automatic <NA> <NA>
## 206 <NA> <NA> gas 95000 clean automatic <NA> <NA>
## 212 <NA> 8 cylinders gas 79000 clean automatic <NA> 4wd
## 368 <NA> <NA> gas NA rebuilt automatic <NA> <NA>
## 388 <NA> <NA> gas NA clean automatic <NA> <NA>
## size type paint_color
## 43 <NA> other white
## 206 <NA> <NA> <NA>
## 212 <NA> <NA> black
## 368 <NA> <NA> <NA>
## 388 <NA> <NA> <NA>
## image_url
## 43 https://images.craigslist.org/00303_eLTsWH0uS84_0gw0co_600x450.jpg
## 206 https://images.craigslist.org/00n0n_cZ7RUc9IPIl_0t20CI_600x450.jpg
## 212 https://images.craigslist.org/00j0j_jesJuu1ztPT_0CI0t2_600x450.jpg
## 368 https://images.craigslist.org/00I0I_l2zYOK4B5Vz_0CI0t2_600x450.jpg
## 388 https://images.craigslist.org/00606_fJy6wrbrq4Z_0CI0t2_600x450.jpg
## description
## 43 All Trucks USA12106 Old River RdRockton, IL 61072Ask for: Craigslist SalesMain: (815) 624-1400Light Duty Service Trucks + Commercial Truck Super Store www.AllTrucksUSA.comPrice: Call for PricingDescription: ****SEE A TRUCK FOR SALE AND WANT MORE PHOTOS? JUST TYPE THE SIX DIGIT STOCK NUMBER IN THE SEARCH BAR AT www.AllTrucksUSA.com / OR CALL 815-624-1400 FOR PRICING OR QUICK ANSWERS TO ANY QUESTIONS. FINANCING & DELIVERY AVAILABLE****----------------------------------------------------------------------------------------------------------------------------------------------Stock# B59285 - 2013 FORD F350 2WD REGULAR CAB SERVICE TRUCK, 6.2L V8 GAS, AUTOMATIC, 11' KNAPHEIDE UTILITY BODY w/ FLIP UP STORAGE LIDS, LADDER RACK, CLOTH BUCKET SEATS, TILT STEERING, CRUISE CONTROL, A/C, AM/FM RADIO, TRACTION CONTROL, 14,000 lb GVW / 122,682 MILESStock# B76324 - 2013 FORD F350 4X4 REGULAR CAB SERVICE TRUCK, 6.7L V8 POWER STROKE TURBO DIESEL, AUTOMATIC, 9' KNAPHEIDE UTILITY BODY, HITCH RECEIVER, FACTORY BRAKE CONTROLLER, POWER WINDOWS LOCKS AND MIRRORS, VINYL BUCKET SEATS, CRUISE CONTROL, TILT STEERING, A/C, AM/FM RADIO, 4WD, 14,000 lb GVW / 67,923 MILESStock# 142845 - 2015 CHEVY 3500HD 4X4 CREW CAB SERVICE TRUCK, 6.0L V8 GAS, AUTOMATIC, 9' KNAPHEIDE UTILITY BODY, DUALLY, HITCH RECEIVER, FACTORY BRAKE CONTROLLER, VINYL BUCKET SEATS, POWER WINDOWS MIRRORS AND LOCKS, CRUISE CONTROL, TILT STEERING, AM/RM RADIO, TRACTION CONTROL, 4WD, 13,200 lb GVW / 152,637 MILESStock# C97015 - 2011 FORD F250 2WD EXTENDED CAB SERVICE TRUCK, 6.2L V8 GAS & CNG FUEL, AUTOMATIC, 8' STEELWELD UTILITY BODY, HITCH RECEIVER, FACTORY BRAKE CONTROLLER, CLOTH BUCKET SEATS, POWER WINDOWS MIRRORS AND LOCKS, CRUISE CONTROL, DIFFERENTIAL LOCK, A/C, TRACTION CONTROL, LADDER RACK, 10,000 lb GVW / 176,743 MILES -- $19,950 \t\t\t\t\t \t\t\t\t\t\t \t\t\t\t \t\t\t\t\t\t\t\t \t\t\t\t\t\t\t\t \t\t\t\tStock# C71616 - 2012 FORD F350 2WD EXTENDED CAB SERVICE TRUCK, 6.2L V8 GAS, AUTOMATIC, 9' ETI UTILITY BODY w/ FLIP 
UP STORAGE LID, PINTLE HITCH, CLOTH BUCKET SEATS, POWER WINDOWS LOCKS AND MIRRORS, CRUISE CONTROL, CD PLAYER RADIO, TRACTION CONTROL, 13,300 lb GVW / 168,419 MILES Stock# 138752 - 2012 DODGE RAM 3500HD 4X4 REGULAR CAB SERVICE TRUCK, 5.7L HEMI V8 GAS, AUTOMATIC, 9' RAWSON KOENIG UTILITY BODY, HITCH RECEIVER, VINYL BUCKET SEATS, POWER WINDOWS MIRRORS AND LOCKS, CD PLAYER RADIO, TOW HAUL, A/C, 4WD, DUALLY (DRW), 12,500 lb GVW / 157,065 MILESStock# 134948 - 2013 GMC SIERRA 3500HD 4X4 CREW CAB SERVICE TRUCK, 6.6L V8 DURAMAX TURBO DIESEL, AUTOMATIC, 9' ROYAL UTILITY BODY w/ FLIP TOP STORAGE, HITCH RECEIVER, FACTORY BRAKE CONTROLLER, POWER WINDOWS LOCKS AND MIRRORS, CLOTH BUCKET SEATS, CRUISE CONTROL, AM/FM RADIO, PTO CAPABLE, TILT STEERING, 4WD, 13,200 lb GVW / 114,998 MILES /Stock# 225534 - 2011 GMC 2500HD 4X4 CREW CAB MECHANICS TRUCK, 6.6L V8 DURAMAX TURBO DIESEL, AUTOMATIC, 3,200 lb RKI CRANE, 8' KNAPHEIDE UTILITY BODY, 2 OUTRIGGERS, CLOTH BUCKET SEATS, CRUISE CONTROL, FACTORY BRAKE CONTROLLER, AM/FM RADIO, TRACTION CONTROL, EXHAUST BRAKE, 160,464 MILESStock# A75556 - 2016 FORD TRANSIT 3500 HD REGULAR CAB CUTAWAY SERVICE VAN, 3.7L V6 GAS, AUTOMATIC, 11' READING ENCLOSED UTILITY BODY, 10' INSIDE FLOOR LENGTH, 48.5'' FLOOR WIDTH, 4' 11'' STANDING HEIGHT, VINYL BUCKET SEATS, POWER WINDOWS MIRRORS AND LOCKS, A/C, CD PLAYER RADIO, BLUETOOTH, BACK-UP CAMERA, 2WD, DUALLY (DRW), 9,950 lb GVW / 131,719 MILES Stock# 305423 - 2013 GMC 3500 HD 4X4 SERVICE TRUCK, 6.0L VORTEC V8 GAS, AUTOMATIC, 8' STAHL UTILITY BODY, HITCH RECEIVER, FACTORY BRAKE CONTROLLER, CLOTH BUCKET SEATS, TILT STEERING, POWER WINDOWS MIRRORS AND LOCKS, CRUISE CONTROL, CD PLAYER RADIO, A/C, DUALLY (DRW), 13,025 lb GVW / 182,382 MILESStock# 326792 - 2012 CHEVY 3500HD 2WD EXTENDED CAB ENCLOSED SERVICE TRUCK, 6.0L VORTEC V8 GAS, 8' ENCLOSED BRAND FX UTILITY BODY, 4' FLOOR WIDTH, 6' STANDING HEIGHT, POWER INVERTER, PINTLE HITCH, FACTORY BRAKE CONTROLLER, VINYL BUCKET SEATS, CRUISE CONTROL, TILT STEERING, CD 
PLAYER, A/C, TRACTION CONTROL, SINGLE REAR WHEEL (SRW), 10,000 lb GVW / 160,413 MILESStock# 139072 - 2012 CHEVY 2500HD 4X4 EXTENDED CAB SERVICE TRUCK, 6.0L VORTEC V8 GAS, AUTOMATIC, 9' KNAPHEIDE UTILITY BODY, HITCH RECEIVER, LADDER RACK, CLOTH BUCKET SEATS, TILT STEERING, A/C, CRUISE CONTROL, RADIO, TRACTION CONTROL, 4WD, SINGLE REAR WHEEL (SRW), 9,500 lb GVW / 223,064 MILESStock# B66477 - 2011 FORD F350 2WD REGULAR CAB SERVICE TRUCK, 6.7L POWERSTROKE TURBO DIESEL, AUTOMATIC, 9' ETI UTILITY BODY, HITCH RECEIVER, FACTORY BRAKE CONTROLLER, VINYL BUCKET SEATS, TILT STEERING, CRUISE CONTROL, A/C, CD PLAYER RADIO, TRACTION CONTROL, DUALLY (DRW), 13,300 lb GVW / 206,192 MILES Stock# 237861 - 2012 GMC 2500 HD 4X4 EXTENDED CAB SERVICE TRUCK, 6.0L VORTEC V8 GAS, AUTOMATIC, 8' KNAPHEIDE UTILITY BODY, HITCH RECEIVER, CLOTH BUCKET SEATS, A/C, CRUISE CONTROL, AM/FM RADIO, TOW HAUL, 4WD, SINGLE REAR WHEEL (SRW), 9,500 lb GVW / 167,757 MILESAll Trucks USAAsk for: Craigslist Sales☎ (815) 624-140012106 Old River Rd Rockton, IL 61072Basic Information:Year: 2014Make: FordModel: FORD CHEVY GMC DODGE RAMStock Number: 333333Condition: UsedType: Service, UtilityClass: Class 3 (10,001-14,000 Lbs.)Color: WHITEMileage: 0Cab Type: REGULAR / STANDARDPassengers: 3Body Type: Regular CabTrim: POWERSTROKE DURAMAX CUMMINS DIESEL AND GASCondition:Title: ClearEngine:Fuel Type: DieselEngine Make: FordEngine Description: 6.2L V8DriveTrain:Transmission Type: AutomaticSuspension:Suspension Type: SpringInstrumentation:TachometerTrip OdometerIn Car Entertainment:CD PlayerAM/FM StereoSeats:Seat Upholstery: VinylSeat Type: BucketConvenience:Power Door LocksPower WindowsPower SteeringPower MirrorsTilt Steering WheelTilt/Telescoping Steering WheelCruise ControlComfort:Air Conditioning**Great selection of commercial trucks. All trucks go through our mega service center and are ready to start working and making you money. For quick answers to any questions you may have please call 815-624-1400 . 
We are here to help, delivery available. You can also visit our online showroom at www.AllTrucksUSA.com A27FBAFAEA464DBBB650D168074EE06C 16922656 8931296
## 206 Great truck everything works ding on back finder 4x4 works great hwy miles $18,000 cash 💸 cell # show contact info
## 212 ***(Runs like new )... -New motor rebuilt (less than 400 miles since rebuilt ).! Didn’t need complete rebuild but timing chain had to be replaced so went ahead and rebuild motor. Around $7500 -New suspension pump and sensors $2500 -Brand new tires (All terrain Nitto Terra Grappler ). $1500 New battery $250 Loaded (navigation , back up camera ,xm radio , sunroof , Bluetooth, entertainment etc NO LOWBALLERS!!! Always an Alabama truck so zero rust ! Call or text 205281203six Basically a new Rover except not $85k sticker price Serious buyers not tire kickers, serious buyers can drive suv to mechanich of their preference and verify mechanical condition. Won’t be disappointed. No scammers or help needed to sell it. Clean and clear Alabama title ready to go. Great condition only selling to buy a new one
## 368 2007 Mazda 6 automatic 4 cylinder 120,000 miles on it gas saver Alabama rebuilt title runs great Good tires Ac & heat work Asking $3400 OBO
## 388 2005 Range Rover HSE, 4X4 , V8 , only 102k miles , very good running suv ! , loaded , leather , sunroof, navigation, back up sensors, towing package , cold AC , heater works , clean title $6500 Cash no traded , no finance show contact info
## state lat long posting_date
## 43 al NA NA 2020-11-17T14:55:59-0600
## 206 al 32.8210 -88.1589 2020-12-01T07:50:35-0600
## 212 al 33.2663 -86.9020 2020-12-01T00:27:55-0600
## 368 al 32.9791 -86.0484 2020-11-28T13:17:42-0600
## 388 al 33.4653 -86.8082 2020-11-28T09:07:31-0600
group_model <- data %>%
group_by(model) %>%
summarize(frequency = n())
group_model[order(-group_model$frequency),][1:10,]
## # A tibble: 10 x 2
## model frequency
## <chr> <int>
## 1 f-150 8370
## 2 silverado 1500 5964
## 3 <NA> 4847
## 4 1500 4211
## 5 camry 4033
## 6 accord 3730
## 7 altima 3490
## 8 civic 3479
## 9 escape 3444
## 10 silverado 3090
nrow(data[is.na(data$condition),])
## [1] 192940
The vehicle condition column is empty in roughly 42% of the Craigslist listings (192,940 of 458,213) ヽ(°〇°)ノ. That is a real problem, since we also wanted to use this attribute to determine vehicle price trends.
length(unique(data$condition))
## [1] 7
unique(data$condition)
## [1] "good" "excellent" NA "like new" "fair" "salvage"
## [7] "new"
It only gives a quick indication of the car's condition to someone browsing the listings.
group_cond <- data %>%
group_by(condition) %>%
summarize(frequency = n())
ggplot(data = group_cond, aes(x = condition, y = frequency)) +
geom_bar( stat = "identity", fill= 2, color="#ffffff") +
ylab("Frequency") +
xlab("Condition")
nrow(data[is.na(data$cylinders),])
## [1] 171140
We see that more than a third of the records (about 37%) have no value for the number of cylinders.
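Instead of counting missing values one attribute at a time, the shares for all columns can be computed in one pass. A minimal sketch in base R; the helper `missing_share` and the toy data frame are hypothetical, while the counts in the last statement are the ones reported in this section.

```r
# Hypothetical helper: share of missing values per column, highest first.
missing_share <- function(df) {
  sort(colMeans(is.na(df)), decreasing = TRUE)
}

# Toy data frame standing in for the real `data`.
toy <- data.frame(
  condition = c("good", NA, NA, "fair"),
  cylinders = c("8 cylinders", NA, "4 cylinders", "6 cylinders"),
  fuel = c("gas", "gas", "diesel", "gas"),
  stringsAsFactors = FALSE
)
shares <- missing_share(toy)
print(shares)

# Percentages implied by the counts reported above (458213 rows in total).
reported <- c(condition = 192940, cylinders = 171140, fuel = 3237)
print(round(100 * reported / 458213, 1))
```

Calling `missing_share(data)` on the full dataset would rank all 25 attributes by how incomplete they are.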
length(unique(data$cylinders))
## [1] 9
unique(data$cylinders)
## [1] "8 cylinders" "4 cylinders" "6 cylinders" NA "10 cylinders"
## [6] "other" "5 cylinders" "3 cylinders" "12 cylinders"
group_cyl <- data %>%
group_by(cylinders) %>%
summarize(frequency = n())
ggplot(data = group_cyl, aes(x = cylinders, y = frequency)) +
geom_bar( stat = "identity", fill= 2, color="#ffffff") +
ylab("Frequency") +
xlab("Number of cylinders")
nrow(data[is.na(data$fuel),])
## [1] 3237
We see that 3,237 cars have no fuel type listed, so apparently they run for free. (✧ω✧)
length(unique(data$fuel))
## [1] 6
unique(data$fuel)
## [1] "gas" "diesel" "other" "hybrid" NA "electric"
group_fuel <- data %>%
group_by(fuel) %>%
summarize(frequency = n())
ggplot(data = group_fuel, aes(x = fuel, y = frequency)) +
geom_bar( stat = "identity", fill= 2, color="#ffffff") +
ylab("Frequency") +
xlab("Fuel") +
scale_y_continuous(breaks = seq(0, 500000, by = 20000))
The bar chart shows that most of the advertised cars run on gasoline.
nrow(data[is.na(data$odometer),])
## [1] 55303
boxplot(data$odometer, las=3)
data[order(-data$odometer),][1:10,]
## id
## 380797 7229317000
## 153223 7237854582
## 21880 7225071349
## 29826 7239691085
## 30283 7240189151
## 31403 7238955004
## 54968 7239456493
## 61130 7240290163
## 68619 7240659360
## 75189 7240449192
## url
## 380797 https://elpaso.craigslist.org/ctd/d/el-paso-2005-gmc-sierra-2500hd-crew-cab/7229317000.html
## 153223 https://desmoines.craigslist.org/ctd/d/des-moines-2016-hyundai-veloster/7237854582.html
## 21880 https://littlerock.craigslist.org/cto/d/mount-ida-1980-jeep-cj-obo-need-to-sell/7225071349.html
## 29826 https://imperial.craigslist.org/cto/d/brawley-1963-ss-impala/7239691085.html
## 30283 https://inlandempire.craigslist.org/cto/d/riverside-1970-oldsmobile-442-got-it/7240189151.html
## 31403 https://inlandempire.craigslist.org/cto/d/moreno-valley-2013-nissan-rogue/7238955004.html
## 54968 https://sandiego.craigslist.org/esd/cto/d/descanso-fire-truck-for-sale/7239456493.html
## 61130 https://sfbay.craigslist.org/sby/cto/d/watsonville-2003-dodge-stratus-only-82/7240290163.html
## 68619 https://cosprings.craigslist.org/cto/d/colorado-springs-1952-ford-victoria/7240659360.html
## 75189 https://pueblo.craigslist.org/cto/d/avondale-1987-jeep-grand-cherokee-laredo/7240449192.html
## region region_url price year
## 380797 el paso https://elpaso.craigslist.org 16995 2005
## 153223 des moines https://desmoines.craigslist.org 12995 2016
## 21880 little rock https://littlerock.craigslist.org 4250 1980
## 29826 imperial county https://imperial.craigslist.org 8900 1963
## 30283 inland empire https://inlandempire.craigslist.org 6000 1970
## 31403 inland empire https://inlandempire.craigslist.org 5800 2013
## 54968 san diego https://sandiego.craigslist.org 4000 1977
## 61130 SF bay area https://sfbay.craigslist.org 4350 2003
## 68619 colorado springs https://cosprings.craigslist.org 4500 1952
## 75189 pueblo https://pueblo.craigslist.org 1500 1987
## manufacturer model condition cylinders fuel
## 380797 gmc sierra 2500hd excellent 8 cylinders diesel
## 153223 hyundai veloster excellent 4 cylinders gas
## 21880 jeep cj5 fair 6 cylinders gas
## 29826 chevrolet impala good other gas
## 30283 <NA> oldsmobile 442Oldsmobile fair 8 cylinders gas
## 31403 nissan rogue excellent 4 cylinders gas
## 54968 <NA> Hendrickson Truck <NA> <NA> diesel
## 61130 dodge <NA> <NA> <NA> gas
## 68619 ford victoria <NA> <NA> gas
## 75189 jeep grand cherokee good 6 cylinders gas
## odometer title_status transmission VIN drive size
## 380797 2043755555 clean automatic 1GTHK23295F846900 4wd full-size
## 153223 123459789 clean automatic KMHTC6AD5GU274883 fwd <NA>
## 21880 10000000 clean manual <NA> 4wd <NA>
## 29826 10000000 missing automatic <NA> rwd full-size
## 30283 10000000 clean automatic <NA> rwd <NA>
## 31403 10000000 clean automatic <NA> fwd mid-size
## 54968 10000000 clean other <NA> <NA> <NA>
## 61130 10000000 clean automatic <NA> <NA> <NA>
## 68619 10000000 clean automatic <NA> <NA> <NA>
## 75189 10000000 clean automatic <NA> 4wd full-size
## type paint_color
## 380797 pickup white
## 153223 sedan <NA>
## 21880 <NA> black
## 29826 coupe blue
## 30283 coupe <NA>
## 31403 SUV white
## 54968 <NA> <NA>
## 61130 <NA> <NA>
## 68619 <NA> <NA>
## 75189 SUV white
## image_url
## 380797 https://images.craigslist.org/00w0w_lfHv2x91qs9_09G07g_600x450.jpg
## 153223 https://images.craigslist.org/00r0r_aRISNNafNmW_0ak07K_600x450.jpg
## 21880 https://images.craigslist.org/00D0D_4KJ1YF78MlF_0CI0t2_600x450.jpg
## 29826 https://images.craigslist.org/00a0a_a808owBmYbU_0CI0t2_600x450.jpg
## 30283 https://images.craigslist.org/00303_iipxxkO4Ch9_0CI0t2_600x450.jpg
## 31403 https://images.craigslist.org/00l0l_3o0WAhg5cC1_0CI0t2_600x450.jpg
## 54968 https://images.craigslist.org/00M0M_jWVD1HzcMMc_0CI0lM_600x450.jpg
## 61130 https://images.craigslist.org/01717_2itBUkVEBnf_0CI0t2_600x450.jpg
## 68619 https://images.craigslist.org/00q0q_O9i1exO5vk_0x20t2_600x450.jpg
## 75189 https://images.craigslist.org/00f0f_62UrK7Bd7wQ_0lM0t2_600x450.jpg
## description
## 380797 Melendez Auto Sales Inc. 7725 Alameda Ave 7712 Alameda Ave, El Paso, TX 79915Or use the link belowto view more information!http://WWW.MELENDEZAUTOSALES.COM💬💬💬 HABLAMOS ESPAÑOL. 💬💬💬Para mas informacion, llamar al ☎ (915) 772-0020 / 91577840142005 GMC Sierra 2500HD Crew Cab 153 WB 4WD SLE Pickup 2,043,755,555 miles / / / Call (or Text) (915) 778−4014 for quick answers to your questions about this GMC Sierra 2500HD Crew Cab 153 WB 4WD SLE.***** GMC Sierra 2500HD Crew Cab 153 WB 4WD SLE Pickup *****2006, 2007, 2008, 2005, 2004, 2003, 2002, GMC, Sierra 2500HD, Envoy, Envoy XL, Safari, Savana 1500, [Model5]Disclaimer : Call or Text 915 727 4490Call (or text) ☏ (915) 778−4014Melendez Auto Sales Inc. 7725 Alameda Ave 7712 Alameda Ave, El Paso, TX 79915Or use the link belowto view more information!http://WWW.MELENDEZAUTOSALES.COM*GMC* *Envoy* *Envoy XL* *GMC* Safari* GMC* *Savana 1500* *Automatic* *Crew Cab 153 WB 4WD SLE* *GMC* *White* *Automatic* *Pickup* *6.6L 300.0hp* *4WD* *Melendez Auto Sales Inc.* *Call us today at (915) 778−4014* *GMC Sierra 2500HD Crew Cab 153 WB 4WD SLE Pickup 4WD 6.6L 300.0hp* *GMC* *Crew Cab 153 WB 4WD SLE* *GMC Sierra 2500HD Crew Cab 153 WB 4WD SLE Pickup 4WD 6.6L 300.0hp**GMC* *White* *Automatic* *Pickup* *6.6L 300.0hp* *4WD* *Call us today at (915) 778−4014* *GMC* *White* *Automatic* *Melendez Auto Sales Inc.* *Pickup* *6.6L 300.0hp* 2001 2000 1999 1998 1997 1996
## 153223 2016 HYUNDAI VELOSTER Sedan \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t\tCall: (515) 262-9538 | Stock #: 74883 $12,995.00 \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t\tTom's Auto Group \t\t\t\t\t\t\t\t2136 East University Ave. \t\t\t\t\t\t\t\tDes Moines, IA 50317 \t\t\t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t\t\tOr use the link below to view more information! \t\t\t\t\t\t\t\thttps://tomsautogroup.com/used-2016-hyundai-veloster-v5364948.html \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t\tYear : 2016 \t\t\t\t\t\t\tMake : HYUNDAI \t\t\t\t\t\t\tModel : VELOSTER \t\t\t\t\t\t\tTrim : \t\t\t\t\t\t\tMileage : 123,459,789 \t\t\t\t\t\t\tTransmission : Automatic \t\t\t\t\t\t\tExterior Color : Pacific Blue \t\t\t\t\t\t\tInterior Color : Black \t\t\t\t\t\t\tEngine : 1.6L 4 Cylinder \t\t\t\t\t\t\tVIN : KMHTC6AD5GU274883 \t\t\t\t\t\t\tStock # : 74883 \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t \t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t\tDescription of this HYUNDAI VELOSTER Sedan \t\t\t\t\t\t\t2016 Hyundai Veloster Pacific Blue in color with Power Windows and Locks ,Tilt, Cruise, Paddle Shifter, Steering Wheel Audio Controls, Bluetooth, Cd player with AUX, Back up Camera, Fresh Detail, READY TO GO \t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t \t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t\tOptional equipment of this HYUNDAI VELOSTER Sedan \t\t\t\t\t\t\t \t\t\t \t\t\t\t \t\t\t\t \t\t\t\t\tThis Vehicle Includes These Added Value Features! 
\t\t\t\t\t Back-Up Camera Bluetooth Connection \t\t\t\t \t\t\t \t\t\t\tVehicle Options: \t\t\t\t \t\t\t\t\tA/C Adjustable Steering Wheel Automatic Headlights Back-Up Camera Cargo Shade Cruise Control Driver Illuminated Vanity Mirror Driver Vanity Mirror Engine Immobilizer Heated Mirrors Intermittent Wipers Keyless Entry Passenger Illuminated Visor Mirror Passenger Vanity Mirror Power Door Locks Power Mirror(s) Power Steering Power Windows Rear Defrost Security System Steering Wheel Audio Controls Trip Computer Variable Speed Intermittent Wipers \t\t\t\t \t\t\t\t\tFront Wheel Drive Gasoline Fuel Transmission with Dual Shift Mode \t\t\t\t \t\t\t\t\tAM/FM Stereo Auxiliary Audio Input MP3 Player Satellite Radio \t\t\t\t \t\t\t\t\t4-Wheel Disc Brakes ABS Brake Assist Child Safety Locks Daytime Running Lights Driver Air Bag Front Head Air Bag Front Side Air Bag Passenger Air Bag Passenger Air Bag Sensor Rear Head Air Bag Stability Control Traction Control \t\t\t\t \t\t\t\t\tBucket Seats Cloth Seats Pass-Through Rear Seat Rear Bench Seat \t\t\t\t \t\t\t\t\tAluminum Wheels Tire Pressure Monitor \t\t\t\t \t\t\t\t\tBluetooth Connection \t\t\t\t \t\t\t \t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\t \t\t\t\t\tCopy and Paste this link for more vehicle information: \t\t\t\t\thttps://tomsautogroup.com/used-2016-hyundai-veloster-v5364948.html \t\t\t\t\t \t\t\t\t\tCall: (515) 262-9538 to get the best price! \t\t\t\t\t \t\t\t\t\tTom's Auto Group \t\t\t\t\t2136 East University Ave. \t\t\t\t\tDes Moines, IA 50317 \t\t\t\t\t \t\t\t\t\t2016 HYUNDAI VELOSTER for sale in Des Moines
## 21880 Looking to get rid of my 1980 CJ5 its a solid jeep ready to hunt would be good on a lease or a deer camp. It runs good, it is spray on bedlined inside and out. It is definitely not a perfect jeep inline 6cyl 4speed manual shifts out good 4wd drive works good has lock out hubs. Drivers seat is in good condition the passenger seat needs recovered or replaced. I have 2 other sets of seats and a back seat for it but it needs the rear seat mount I just kept it sitting on floor for the kids to sit on when riding dirt rds. It will need brake work eventually, and steeeing box leaks I've had it 2 yrs and drive it like it is but you have to pump the brakes to stop. I have replaced the old coil type ignition with a GM style HEI distributor and it has a newer carburetor It comes with a full soft top w/doors and a bikini top both are in very good condition. Tires are like new 235 75 15 XL. The 8000lb Ramsey winch works as it should also. Asking $4250 obo Again Its not a perfect jeep but it runs drives and stops fine for a hunting ride. Texting or email is the best way to contact me. Phone calls are spotty at best and may not be returned due to poor cell service. I will deal in person cash only no paypal, no agent pick ups or whatever and I don't need help selling
## 29826 PATINA! 63 SS IMPALA project car. Straight Body, Og paint, No motor or Tranny ,No Bondo or primer , front end included, trunk , glass, No seats, Does have dent on passenger lower front 1/4 panel & rocker. needs floorpans, No Title, Bill of Sale only. $8900
## 30283 I got it running and it smokes the tires . Real deal 1970 442 , needs restored has 350 motor in it now but comes with a 455 and extra transmission. Has a aftermarket ram air hood and stock hood. Tilt wheel , AC , and disc brakes. Uncut dash. Dual exhaust. I believe it is Viking blue with white interior and had a white vinyl top. It has Soft -Ray Windows $6000 OBO show contact info It started up after about 10 minutes of working on it , sounds good with the dual exhaust Cutlass SS Chevelle GTO GS 1968 1969 1971 1972
## 31403 2013 Nissan Rogue 2.5L Air condition power locks doors power windows Alarm key less enter Alloys wheels
## 54968 Mid to Late 1970's Fire truck for Sale. the model of truck is a Hendrickson 1871-S. (If you google that model name, you will have a idea on what it looks like) Runs good with lights and siren and PA system still working. It has a 671 Detroit diesel engine with a 5 speed transmission.an approx. 50K miles. Tires in good condition as well. It has some fire hoses along with the water cannon has a hose with it too. purchased it a couple of years ago but have not got around to getting all the other equipment with it. So now I have to part with it. Oh, it does also have a ladder and all the pump valves have been overhauled too. If interested, contact me or text me at show contact info for additional answers to your questions. Asking $4000.00 O.B.O.
## 61130 Espanol / English show contact info No codes needed Post will be removed when gone No low ballers 2003 Stratus R/T 30 6 cyl. 3.0 motor only 82.000 miles Automatic Red on black Clean interior 4 CD changer Stereo with pleasent sound. Dent on fender Front new tires Clean Title Just smoged Just regestered 4,300 priced for fast sale Only 82, k miles Trades welcome ( no junk ) Serious buyer only show contact info 1998 1999 2000 2001 2002 2004 2005 2006 2007 1941 1940 1942 1939
## 68619 52 Victoria. Flathead V-8, automatic. Runs, drives, stops, but needs work to be road worthy. $4,500.00 or best offer. Trades considered. Call or text Only!! NO emails! seven19-338-411seven No help needed selling this car!
## 75189 Straight 6, 4x4 ,auto trans body in very good condition.
## state lat long posting_date
## 380797 tx 31.73213 -106.36604 2020-11-11T13:45:37-0700
## 153223 ia 41.60098 -93.58129 2020-11-27T14:36:14-0600
## 21880 ar 34.56120 -93.57490 2020-11-03T17:10:09-0600
## 29826 ca 32.97400 -115.53460 2020-11-30T20:16:29-0800
## 30283 ca 33.94550 -117.37570 2020-12-01T16:30:35-0800
## 31403 ca 33.92200 -117.24900 2020-11-29T14:21:00-0800
## 54968 ca 33.05340 -116.56580 2020-11-30T12:16:19-0800
## 61130 ca 36.91020 -121.75690 2020-12-01T21:37:26-0800
## 68619 co 38.77687 -104.78125 2020-12-02T14:31:03-0700
## 75189 co 38.10250 -104.52980 2020-12-02T09:23:07-0700
data_gt_mil_odo <- data[!is.na(data$odometer) & data$odometer > 1000000,]
nrow(data_gt_mil_odo)
## [1] 388
About 400 records (388) report an odometer reading above one million miles. We suspect these values are made up.
boxplot(data$odometer[!is.na(data$odometer) & data$odometer < 1000000], las = 2)
After restricting the attribute to values below 1 million miles, the data look more realistic. Most values fall between roughly 45,000 and 150,000 miles, and the median is slightly below 100,000.
nrow(data[is.na(data$transmission),])
## [1] 2442
length(unique(data$transmission))
## [1] 4
unique(data$transmission)
## [1] "other" "automatic" "manual" NA
This is a categorical attribute describing the type of transmission.
group_trans <- data %>%
group_by(transmission) %>%
summarize(frequency = n())
ggplot(data = group_trans, aes(x = transmission, y = frequency)) +
geom_bar( stat = "identity", fill= 2, color="#ffffff") +
ylab("Frequency") +
xlab("Transmission")+
scale_y_continuous(breaks = seq(0, 500000, by = 20000))
The chart is consistent with the stereotype that Americans do not drive stick (and with the joke that a manual transmission is the best anti-theft system a car can have).
What is that stick?
nrow(data[is.na(data$VIN),])
## [1] 187572
A categorical attribute holding the vehicle identification number (VIN).
nrow(data[is.na(data$drive),])
## [1] 134188
length(unique(data$drive))
## [1] 4
unique(data$drive)
## [1] "rwd" "fwd" NA "4wd"
group_drv <- data %>%
group_by(drive) %>%
summarize(frequency = n())
ggplot(data = group_drv, aes(x = drive, y = frequency)) +
geom_bar( stat = "identity", fill= 2, color="#ffffff") +
ylab("Frequency") +
xlab("Drive")+
scale_y_continuous(breaks = seq(0, 500000, by = 20000))
Most of the listed vehicles have four-wheel drive. Records with no value for this attribute also form a large group.
nrow(data[is.na(data$size),])
## [1] 321348
length(unique(data$size))
## [1] 5
unique(data$size)
## [1] NA "full-size" "mid-size" "compact" "sub-compact"
group_size <- data %>%
group_by(size) %>%
summarize(frequency = n())
ggplot(data = group_size, aes(x = size, y = frequency)) +
geom_bar( stat = "identity", fill= 2, color="#ffffff") +
ylab("Frequency") +
xlab("Size")+
scale_y_continuous(breaks = seq(0, 500000, by = 20000))
size describes the size class of the vehicle. It is probably an optional field on Craigslist, because most records (321,348) do not have it.
nrow(data[is.na(data$type),])
## [1] 112738
length(unique(data$type))
## [1] 14
unique(data$type)
## [1] "other" "sedan" "SUV" "pickup" "coupe"
## [6] "van" NA "truck" "mini-van" "wagon"
## [11] "convertible" "hatchback" "bus" "offroad"
group_type <- data %>%
group_by(type) %>%
summarize(frequency = n())
ggplot(data = group_type, aes(x = type, y = frequency)) +
geom_bar( stat = "identity", fill= 2, color="#ffffff") +
ylab("Frequency") +
xlab("Type")+
scale_y_continuous(breaks = seq(0, 500000, by = 20000)) +
theme(axis.text.x = element_text(angle = 90))
ggplot(data, aes(x = fuel, y = odometer)) + ylim(10,500000) + geom_boxplot()
## Warning: Removed 60218 rows containing non-finite values (stat_boxplot).
The boxplot shows that diesel cars are sold with higher mileage. What surprised us was, for example, the ratio between hybrid and gasoline (gas) engines. We suspect, however, that many records are still unclassified because their fuel attribute is empty.
ggplot(data, aes(x = drive, y = odometer)) + ylim(10,500000) + geom_boxplot()
## Warning: Removed 60218 rows containing non-finite values (stat_boxplot).
ggplot(data, aes(x = transmission, y = odometer)) + ylim(10,500000) + geom_boxplot()
## Warning: Removed 60218 rows containing non-finite values (stat_boxplot).
ggplot(data, aes(x = fuel, y = price)) + ylim(10,300000) + geom_boxplot()
## Warning: Removed 35401 rows containing non-finite values (stat_boxplot).
ggplot(data, aes(x = drive, y = price)) + ylim(10,300000) + geom_boxplot()
## Warning: Removed 35401 rows containing non-finite values (stat_boxplot).
ggplot(data, aes(x = transmission, y = price)) + ylim(10,300000) + geom_boxplot()
## Warning: Removed 35401 rows containing non-finite values (stat_boxplot).
ggplot(data, aes(x=transmission, y=odometer, color=fuel)) +
ylim(10,500000) +
geom_point(size=6) +
theme_bw()
## Warning: Removed 60218 rows containing missing values (geom_point).
ggplot(data = data, aes(x = transmission,y = price, shape = drive, colour= fuel)) + ylim(10,300000) + geom_jitter(size = 4) + xlab("Transmission")
## Warning: Removed 157887 rows containing missing values (geom_point).
pairs(~price+odometer+year, data = data)
We identified several problems in the dataset, including the following.
One category of problems concerns quantitative attributes such as odometer, price and year. The odometer has several outliers whose values appear to be either made up or entered by mistake. The same holds for price: some models are listed at several times a realistic price, while many listings have a price of zero. These values are not easy to fill in or replace. The listings would have to be split by manufacturer, then by model, year, mileage and so on, because the price depends on all of these parameters. Such a step could itself introduce inaccuracies and skew the data. We therefore do not think it is worth investing significant effort in imputing these values and will delete the affected records instead.
Several attributes, such as model, manufacturer and transmission, have empty values. Either the user filled in the listing incorrectly, or the crawler that downloaded the data had an implementation bug. Some attributes could be repaired, but that would require specific reference data that we could not find. Not every model is available in every configuration; some models, for instance, may not offer a manual transmission together with front-wheel drive. It will therefore be simplest to remove these records.
We detected empty values in almost every attribute. For some attributes (model, manufacturer, fuel and so on) we have no way to fill them in, because we do not know the configuration of the vehicle. For others it makes no sense to try, because the owner did not enter them and we cannot infer the actual description of the vehicle (condition, body type, VIN). We will therefore treat a filled-in value as a plus that may explain a higher price.
For nearly all types of problems, the simplest option for us is to remove the affected records. For attributes that are empty because the owner did not fill them in, we will examine whether filling them in goes along with a higher price. Some empty or undefined values can be distributed among the other categories of the attribute in proportion to their frequencies, so that the distribution stays the same; the transmission type is one example.
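The proportional redistribution of missing categorical values can be sketched as follows. This is only an illustration on a made-up vector, not code used in this report; the helper name `impute_proportional` and the fixed seed are assumptions.

```r
# Hypothetical helper: fill NA values of a categorical vector by sampling
# from the observed categories with their observed relative frequencies.
impute_proportional <- function(x, seed = 42) {
  missing <- is.na(x)
  if (!any(missing) || all(missing)) return(x)
  freq <- table(x[!missing])               # counts of the observed categories
  set.seed(seed)                           # assumed seed, for reproducibility
  x[missing] <- sample(names(freq), size = sum(missing), replace = TRUE,
                       prob = as.numeric(freq) / sum(freq))
  x
}

# Toy data (made up for illustration)
transmission <- c(rep("automatic", 8), rep("manual", 2), rep(NA, 10))
filled <- impute_proportional(transmission)
table(filled)
```

Because the replacement values are sampled, the original proportions are preserved only approximately, and more closely the more values are filled in.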
We will follow a hypothesis-driven approach: we will try to verify the hypotheses we have stated and evaluate them in the conclusion.
nrow(data[duplicated(data[,3:19]),])
## [1] 55473
data[c(60,80),]
## id
## 60 7229265094
## 80 7226011186
## url
## 60 https://auburn.craigslist.org/ctd/d/sacramento-2014-subaru-impreza-20i/7229265094.html
## 80 https://auburn.craigslist.org/ctd/d/sacramento-2014-subaru-impreza-20i/7226011186.html
## region region_url price year manufacturer model
## 60 auburn https://auburn.craigslist.org 12998 2014 subaru impreza
## 80 auburn https://auburn.craigslist.org 12998 2014 subaru impreza
## condition cylinders fuel odometer title_status transmission
## 60 excellent <NA> gas 99598 clean automatic
## 80 excellent <NA> gas 99598 clean automatic
## VIN drive size type paint_color
## 60 JF1GPAU63E8325853 4wd <NA> wagon silver
## 80 JF1GPAU63E8325853 4wd <NA> wagon silver
## image_url
## 60 https://images.craigslist.org/00F0F_j7yIeuD1XCp_0cU09G_600x450.jpg
## 80 https://images.craigslist.org/00Y0Y_3nTaoVwOMG4_0cU09G_600x450.jpg
## description
## 60 2014 *** Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon *** Drive it home today. Call (Or Text) us now !!Call (or text) ☏ (916) 778-3115 916 Auto Sales 4020 Marysville Blvd., Sacramento, CA 95838Copy & Paste the URL belowto view more information!http://916autosales.v12soft.com/cars/13432713 \t\t\tYear : 2014\t\t\t\tMake : Subaru\t\t\t\tModel : Impreza\t\t\t\tTrim : 2.0i Sport Limited AWD 4dr Wagon\t\t\t\t Mileage : 99,598 miles\t\t\t\tTransmission : Automatic\t\t\t\tExterior Color : Silver\t\t\t\tInterior Color : Black\t\t\t\tSeries : 2.0i Sport Limited AWD 4dr Wagon Wagon\t\t\t\tDrivetrain : 4WD\t\t\t\tCondition : Excellent\t\t\t\tVIN : JF1GPAU63E8325853\t\t\t\tStock ID : 8325853\t\t\t\tEngine : 2.0L H4\t \tDescription of this Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon \t \tMeet our versatile 2014 Subaru Impreza 2.0i Sport Limited 5 Door Hatchback shown in Ice Silver Metallic. Powered by a proven 2.0 Liter BOXER 4 Cylinder that provides 148hp while connected to an innovative CVT. This All Wheel Drive achieve incredible fuel economy of near 36mpg plus Subaru's lower center of gravity provides better handling and serves up that that sports car feel. Load up your friends and head out for some highway fun while looking sharp with 17-inch aluminum-alloy wheels, fog-lights, and raised roof rails. Inside our Limited, see plenty of room for your family or fishing buddies plus gear and the family dog. Slide behind the wheel and take in the ergonomic design filled with amenities to provide for a comfortable road trip that start with heated leather seating, Automatic climate control, and an amazing audio system. The exterior lines scream sports car while the interior provides ample space for cargo or passengers. Legendary Subaru reliability and exemplary safety features are abundant to protect the ones you love. No longer does an All Wheel Drive translate to the family truckster. 
Subaru offers instant power to the wheels, which will have you loving your Impreza at first drive. Print this page and call us Now... We Know You Will Enjoy Your Test Drive Towards Ownership! Guaranteed Approval ~ WITH ONLY 1500.00 Down and Drive Off Today pls call 916-888-6888 or 916-6041234 ....WE HAVE FINANCING AVAILABLE.. ~ NO CREDIT ~ NO PROBLEM ~ YOUR JOB IS YOUR CREDIT~ ~ EXTENDED WARRANTY AVAILABLE ~ ~ WE ACCEPT MOST INCOMES TYPES INCLUDING SSI & DISABILITY~ We over 60 cars and trucks to choose from. We have the perfect car for you We finance ~ YOUR JOB IS YOUR CREDIT~ Guaranteed Approval ~ WITH ONLY 1500.00 Down and Drive Off Today CLEAN TITLE CLEAN CARFAX SMOG IN HAND WE WORK WITH VARIOUS BANKS TO GET YOU APPROVED REGARDLESS OF YOUR CREDIT. BAD CREDIT NO PROBLEM NO CREDIT NO PROBLEM FIRST TIME BUYERNO PROBLEM NO LICENCENO PROBLEM WE WORK WITH MULTIPLE LENDERS TO COVER ALL TYPES OF CREDIT SHORT TIME AT THE JOBNO PROBLEM WARRANTY 3 MONTH OR 3000 MILES From the Third-party, IF QUALIFY Some Restriction may apply LOW to NO Down Payment (On Approved Credits) First Time Buyers Special Programs (Low Interest Rate, Low Monthly Payment, Low to No Down Payment) Hispanic Buyers Financing Programs (No Driver's License, ITIN numbers, Low to No Down Payment) Bad Credit/No Credit Financing Programs USAA Navy Federal Seawest Coast Guard Credit Union For more info ,PLS VISIT US @ WWW.916AUTOSALE.COM OR CALL US @ .....916 826-4043 , 916-604-1234 or 916-888-6888 4020 marysvile blvd sac ca 95838 . DISCLAIMER : Prices are subject to change without notice. Internet special prices might not reflect actual sale prices. Please contact our dealership for details. Price also does not include finance charges, finance fees, lender fees, taxes, gov. fees and other sale related charges. Call (or text) (916) 778-3115 for quick answers to your questions about this Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon. 100% APPROVAL with OACWe Guarantee It! 
No matter your financial situation, you will drive off the lot in a new car, today.✅ Bad Credit✅ No Credit ✅ Repossession✅ SSI✅ Disability✅ Government AssistanceYOU'RE APPROVED!🚗 🚕 🚙 916 Auto Sales 🚗 🚕 🚙☎ CALL OR TEXT (916) 778-3115🔴 BAD CREDIT, GOOD CREDIT WE HAVE A VARIETY OF OPTIONS FOR YOU!!!🔵 IN-HOUSE FINANCING. 🔴 WITH OVER TWO-DOZEN LENDERS AVAILABLE, WE CAN PROVIDE A FINANCING SOLUTION TO MOST ANY CREDIT HISTORY.🔵 WARRANTY AVAILABLE🚘 TRADE/SELL/BUY ✅ GAP INSURANCE AVAILABLE ✅ FIRST TIME BUYER, CREDIT PROGRAM↪ CHECK OUT OUR INVENTORY AThttp://916autosales.v12soft.com/cars/13432713 ***** Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon ***** 2015, 2016, 2017, 2014, 2013, 2012, 2011, Subaru Impreza, Forester, Impreza, Legacy, Outback, Tribeca, Impreza Outback Sport, Impreza WRX STi, BRZ, XV Crosstrek, Impreza WRX, XV Crosstrek Hybrid, WRX, WRX STI Disclaimer : * Please Note All Inventory And Inventory Pricing Is Subject To Change Please See Dealer For Further Details * Drive it home today. 
Call (Or Text) us now !!Call (or text) ☏ (916) 778-3115 916 Auto Sales 4020 Marysville Blvd., Sacramento, CA 95838Copy & Paste the URL belowto view more information!http://916autosales.v12soft.com/cars/13432713 2014 14 *Subaru* *Impreza* *Cheap 2.0i Sport Limited AWD 4dr Wagon* \t\t*Like New 2014 2.0i Sport Limited AWD 4dr Wagon Wagon* *2.0L H4* \t\t*Must See 2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline - \t\t2014 Subaru Impreza impreza IMPREZA 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon Cheap - \t\t2014 Subaru Impreza (2.0i Sport Limited AWD 4dr Wagon) Carfax Gasoline 2.0L H4 - \t\t2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon 2.0L H4 Gasoline - \t\tSubaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon \t\t*SCHEDULE YOUR TEST DRIVE 2014 Subaru Impreza 2.0L H4 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon* \t\t*Subaru* *Impreza* 2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon \t\t*2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon \t\t*916 Auto Sales* *Call (or text) us today at (916) 778-3115.* \t\t2015 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon 2.0L H4 - \t\tHave you seen this 2016 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon ? \t\tMust See 2017 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon \t\t*For Sale Impreza* *Impreza* *Carfax 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon \t\tCome test drive this amazing *Subaru* *Impreza* *(2.0I SPORT LIMITED AWD 4DR WAGON)* *Gasoline* Wagon 2.0i Sport Limited AWD 4dr Wagon Wagon Gasoline Wagon Gasoline* \t\t*(Subaru)* *(Impreza)* *2.0i Sport Limited AWD 4dr Wagon* *2.0L H4* *(GASOLINE)* *Bad Credit* \t\t*Gasoline* *Wagon* *Super Vehicle Gasoline Call (or text) this number (916) 778-3115* *2.0L H4* *916 Auto Sales* * Good Credit* \t\t2014 2013 2012 2011 \t\t*This vehicle is a used Subaru Impreza* *No Credit* \t\t*It is like New 2.0i Sport Limited AWD 4dr Wagon* *2.0L H4 Gasoline* \t\t*Gasoline* 2010 2009 2008 2007 2006 2005
## 80 2014 *** Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon *** Ready To Upgrade Your Ride Today? We Make It Fast & Easy!Call (or text) ☏ (916) 619-1849 Motor Sports Sac 4020 Marysville Blvd, Sacramento, CA 95838Copy & Paste the URL belowto view more information!http://motorsportsac.v12soft.com/cars/13442935 \t\t\tYear : 2014\t\t\t\tMake : Subaru\t\t\t\tModel : Impreza\t\t\t\tTrim : 2.0i Sport Limited AWD 4dr Wagon\t\t\t\t Mileage : 99,598 miles\t\t\t\tTransmission : Automatic\t\t\t\tExterior Color : Silver\t\t\t\tInterior Color : Black\t\t\t\tSeries : 2.0i Sport Limited AWD 4dr Wagon Wagon\t\t\t\tDrivetrain : 4WD\t\t\t\tCondition : Excellent\t\t\t\tVIN : JF1GPAU63E8325853\t\t\t\tStock ID : 8325853\t\t\t\tEngine : 2.0L H4\t \tDescription of this Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon \t \tMeet our versatile 2014 Subaru Impreza 2.0i Sport Limited 5 Door Hatchback shown in Ice Silver Metallic. Powered by a proven 2.0 Liter BOXER 4 Cylinder that provides 148hp while connected to an innovative CVT. This All Wheel Drive achieve incredible fuel economy of near 36mpg plus Subaru's lower center of gravity provides better handling and serves up that that sports car feel. Load up your friends and head out for some highway fun while looking sharp with 17-inch aluminum-alloy wheels, fog-lights, and raised roof rails. Inside our Limited, see plenty of room for your family or fishing buddies plus gear and the family dog. Slide behind the wheel and take in the ergonomic design filled with amenities to provide for a comfortable road trip that start with heated leather seating, Automatic climate control, and an amazing audio system. The exterior lines scream sports car while the interior provides ample space for cargo or passengers. Legendary Subaru reliability and exemplary safety features are abundant to protect the ones you love. No longer does an All Wheel Drive translate to the family truckster. 
Subaru offers instant power to the wheels, which will have you loving your Impreza at first drive. Print this page and call us Now... We Know You Will Enjoy Your Test Drive Towards Ownership! Guaranteed Approval ~ WITH ONLY 1500.00 Down and Drive Off Today pls call 916-888-6888 or 916-6041234 ....WE HAVE FINANCING AVAILABLE.. ~ NO CREDIT ~ NO PROBLEM ~ YOUR JOB IS YOUR CREDIT~ ~ EXTENDED WARRANTY AVAILABLE ~ ~ WE ACCEPT MOST INCOMES TYPES INCLUDING SSI & DISABILITY~ We over 60 cars and trucks to choose from. We have the perfect car for you We finance ~ YOUR JOB IS YOUR CREDIT~ Guaranteed Approval ~ WITH ONLY 1500.00 Down and Drive Off Today CLEAN TITLE CLEAN CARFAX SMOG IN HAND WE WORK WITH VARIOUS BANKS TO GET YOU APPROVED REGARDLESS OF YOUR CREDIT. BAD CREDIT NO PROBLEM NO CREDIT NO PROBLEM FIRST TIME BUYERNO PROBLEM NO LICENCENO PROBLEM WE WORK WITH MULTIPLE LENDERS TO COVER ALL TYPES OF CREDIT SHORT TIME AT THE JOBNO PROBLEM WARRANTY 3 MONTH OR 3000 MILES From the Third-party, IF QUALIFY Some Restriction may apply LOW to NO Down Payment (On Approved Credits) First Time Buyers Special Programs (Low Interest Rate, Low Monthly Payment, Low to No Down Payment) Hispanic Buyers Financing Programs (No Driver's License, ITIN numbers, Low to No Down Payment) Bad Credit/No Credit Financing Programs USAA Navy Federal Seawest Coast Guard Credit Union For more info ,PLS VISIT US @ WWW.motorsportsac.com OR CALL US @ .....916 544 3125 916-604-1234 or 916-888-6888 4020 marysvile blvd sac ca 95838 . DISCLAIMER : Prices are subject to change without notice. Internet special prices might not reflect actual sale prices. Please contact our dealership for details. Price also does not include finance charges, finance fees, lender fees, taxes, gov. fees and other sale related charges. Call (or text) (916) 619-1849 for quick answers to your questions about this Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon. ⭐ Great Bank Financing Options Available ⭐✅ Bad Credit? ✅ No Credit? 
✅ First Time Buyer? ✅ We Work With Dozens Of Lenders To Get You Approved Fast Regardless Of Your Credit Situation. 🚘 Ready To Get Behind The Wheel Of This Great Car 🚘 👉 Go to :100% APPROVAL with OACWe Guarantee It! No matter your financial situation, you will drive off the lot in a new car, today.✅ Bad Credit✅ No Credit ✅ Repossession✅ Bankruptcy✅ Foreclosure✅ SSI✅ Disability✅ Government AssistanceYOU'RE APPROVED!🚘 Motor Sports Sac 🚘 ✅ Huge Selection Of Quality Pre-Owned Cars ✅ Leave The Lot With Confidence Ask About Our Competitive Extended Warranties ✅ Trade-In Your Car Today For A Great Discount ✅ We Buy Cars Cash 📍 Stop By Today And See Why Our Dealership Is Always The People's Choice💥 Check Out More Of Our Great Cars On Craigslist Just Copy And Paste This Link Into Your Browser 💥 https://auburn.craigslist.org/search/ctd?query=motorsportsac.v12soft.com ***** Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon ***** 2015, 2016, 2017, 2014, 2013, 2012, 2011, Subaru Impreza, Forester, Impreza, Legacy, Outback, Tribeca, Impreza Outback Sport, Impreza WRX STi, BRZ, XV Crosstrek, Impreza WRX, XV Crosstrek Hybrid, WRX, WRX STI Disclaimer : * Please Note All Inventory And Inventory Pricing Is Subject To Change Please See Dealer For Further Details * Ready To Upgrade Your Ride Today? 
We Make It Fast & Easy!Call (or text) ☏ (916) 619-1849 Motor Sports Sac 4020 Marysville Blvd, Sacramento, CA 95838Copy & Paste the URL belowto view more information!http://motorsportsac.v12soft.com/cars/13442935 2014 14 *Subaru* *Impreza* *Cheap 2.0i Sport Limited AWD 4dr Wagon* \t\t*Like New 2014 2.0i Sport Limited AWD 4dr Wagon Wagon* *2.0L H4* \t\t*Must See 2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline - \t\t2014 Subaru Impreza impreza IMPREZA 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon Cheap - \t\t2014 Subaru Impreza (2.0i Sport Limited AWD 4dr Wagon) Carfax Gasoline 2.0L H4 - \t\t2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon 2.0L H4 Gasoline - \t\tSubaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon \t\t*SCHEDULE YOUR TEST DRIVE 2014 Subaru Impreza 2.0L H4 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon* \t\t*Subaru* *Impreza* 2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon \t\t*2014 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon \t\t*Motor Sports Sac* *Call (or text) us today at (916) 619-1849.* \t\t2015 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon 2.0L H4 - \t\tHave you seen this 2016 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Wagon ? \t\tMust See 2017 Subaru Impreza 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon \t\t*For Sale Impreza* *Impreza* *Carfax 2.0i Sport Limited AWD 4dr Wagon Gasoline Wagon \t\tCome test drive this amazing *Subaru* *Impreza* *(2.0I SPORT LIMITED AWD 4DR WAGON)* *Gasoline* Wagon 2.0i Sport Limited AWD 4dr Wagon Wagon Gasoline Wagon Gasoline* \t\t*(Subaru)* *(Impreza)* *2.0i Sport Limited AWD 4dr Wagon* *2.0L H4* *(GASOLINE)* *Bad Credit* \t\t*Gasoline* *Wagon* *Super Vehicle Gasoline Call (or text) this number (916) 619-1849* *2.0L H4* *Motor Sports Sac* * Good Credit* \t\t2014 2013 2012 2011 \t\t*This vehicle is a used Subaru Impreza* *No Credit* \t\t*It is like New 2.0i Sport Limited AWD 4dr Wagon* *2.0L H4 Gasoline* \t\t*Gasoline* 2010 2009 2008 2007 2006 2005
## state lat long posting_date
## 60 al 38.6411 -121.4286 2020-11-11T13:31:41-0600
## 80 al 38.6411 -121.4286 2020-11-05T13:48:15-0600
Even when comparing a fairly large set of attributes (columns 3 to 19), we found a large number of duplicates (55,473 rows). These duplicates differ only in attributes such as url and id, so we assume the seller uploaded them to Craigslist more than once, and we drop them.
data <- data[!duplicated(data[,3:19]),]
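The deduplication step can be illustrated on a small made-up table. The sketch below selects the comparison columns by name instead of by position (an assumption on our side, chosen so that the example does not depend on column order); `duplicated()` marks every repeated row except its first occurrence.

```r
# Toy listings (made up): rows 1 and 2 differ only in id and url.
listings <- data.frame(
  id    = c(1, 2, 3),
  url   = c("a.html", "b.html", "c.html"),
  price = c(12998, 12998, 4500),
  model = c("impreza", "impreza", "stratus"),
  stringsAsFactors = FALSE
)

# Columns that identify the listing itself are ignored when comparing rows.
compare_cols <- setdiff(names(listings), c("id", "url"))
is_dup <- duplicated(listings[, compare_cols])

deduped <- listings[!is_dup, ]
nrow(deduped)
```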
typeof(data$price)
## [1] "double"
summary(data$price)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1 5995 12495 47355 22990 3615215112 28066
data <- data[!is.na(data$price),]
Besides the NA values, we looked at the very high values seen in the exploratory analysis. We chose an upper and a lower threshold and removed the values outside them. The upper threshold is the value above which car prices seem unrealistic and made up to us.
top_threshold <- 1111111
bottom_threshold <- 500
# keep only prices strictly between the two thresholds
data <- data[data$price > bottom_threshold & data$price < top_threshold, ]
boxplot(data$price, las=2)
ggplot(data = data, aes(sample=price)) +
stat_qq() +
stat_qq_line() +
scale_y_continuous(breaks = seq(0, 1000000, by = 50000))
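# Clamp values below the 5th percentile and above the 95th percentile
# to those percentiles; no rows are removed.
percentil <- function (x) {
  quantiles <- quantile( x, c(.05, .95 ))
  x[ x < quantiles[1] ] <- quantiles[1]
  x[ x > quantiles[2] ] <- quantiles[2]
  return(x)
}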
data$price <- percentil(data$price)
boxplot(data$price, las=2)
ggplot(data = data, aes(sample=price)) +
stat_qq() +
stat_qq_line() +
scale_y_continuous(breaks = seq(0, 5000000, by = 50000))
The flattened start and end of the Q-Q plot show the effect of clamping to the 5th and 95th percentiles. The outlying values are thereby largely neutralized (clamped to the percentile, not removed), which improves the distribution, but the data still do not come from a normal distribution.
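To make the effect of this clamping concrete, here is a minimal sketch on made-up numbers. The helper `winsorize` is a hypothetical variant of `percentil()` with configurable limits; it is not part of the original analysis.

```r
# Hypothetical variant of percentil(): clamp to the given quantiles.
winsorize <- function(x, lower = 0.05, upper = 0.95) {
  q <- quantile(x, c(lower, upper), na.rm = TRUE)
  pmin(pmax(x, q[[1]]), q[[2]])   # values outside [q1, q2] are moved to the limit
}

prices <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1000)   # made-up values with one outlier
clamped <- winsorize(prices)

length(clamped) == length(prices)   # the number of observations is unchanged
range(clamped)                      # the extremes are pulled towards the middle
```

Since every observation is kept, this differs from the threshold filter used earlier, which deletes rows.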
typeof(data$year)
## [1] "double"
data$year <- as.integer(data$year)
summary(data$year)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 1900 2007 2012 2010 2016 2021 955
We try to fill in the missing years from the description attribute:
data[is.na(data$year),]$year <- apply(data[is.na(data$year),], MARGIN=1, FUN=function(row) {
  # [890] instead of [8,9,0]: a comma inside a character class is matched literally
  year <- str_extract(row['description'], "([12][890]\\d{2})")
  return(as.integer(year))
})
nrow(data[is.na(data$year),])
## [1] 18
The fill-in worked well, and we drop the small number of remaining rows (18).
data <- data[!is.na(data$year),]
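Extracting a year from free text can be sketched with base R as below. The helper name `extract_year` and the accepted range (1900 to 2029) are assumptions made for this illustration; the pattern is stricter than the one used in the report.

```r
# Hypothetical helper: first 4-digit token in the range 1900-2029 found in
# each description, or NA when there is none.
extract_year <- function(description) {
  pattern <- "\\b(19[0-9]{2}|20[0-2][0-9])\\b"
  m <- regexpr(pattern, description, perl = TRUE)
  out <- rep(NA_integer_, length(description))
  hit <- !is.na(m) & m > 0                     # positions with a match
  out[hit] <- as.integer(regmatches(description, m))
  out
}

extract_year(c("2013 Nissan Rogue 2.5L", "Asking $4000.00 O.B.O.", NA))
```

A price such as "$2000" would still be read as a year by this pattern, so the result needs a plausibility check against the other attributes.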
ggplot(data = data, aes(x=year)) +
geom_histogram(bins = 121, fill= 6, color="#ffffff")
typeof(data$manufacturer)
## [1] "character"
nrow(data[is.na(data$manufacturer),])
## [1] 14104
In theory this attribute could again be filled in from description with a regex if we had an additional dataset, such as a list of all vehicle manufacturers, but we decided to remove the rows where it is missing.
data <- data[!is.na(data$manufacturer),]
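For completeness, the lookup against a list of manufacturers could look roughly like the sketch below. The vector `known_makes` and the helper `find_make` are hypothetical; a real run would need a complete reference list, which we did not have.

```r
# Assumed reference list (incomplete, for illustration only).
known_makes <- c("ford", "chevrolet", "subaru", "nissan")

# Hypothetical helper: the known make that appears first in each
# description, or NA when none is mentioned.
find_make <- function(description, makes) {
  pattern <- paste0("\\b(", paste(makes, collapse = "|"), ")\\b")
  m <- regexpr(pattern, description, ignore.case = TRUE, perl = TRUE)
  out <- rep(NA_character_, length(description))
  hit <- !is.na(m) & m > 0
  out[hit] <- tolower(regmatches(description, m))
  out
}

find_make(c("2013 Nissan Rogue 2.5L", "Straight 6, 4x4, auto trans"), known_makes)
```

Names containing regex metacharacters would have to be escaped before being pasted into the pattern.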
As mentioned in the analysis, a suitable way to fill in missing values of model is a positive lookbehind on the description attribute. In short, we take the first word that follows the manufacturer in the listing description. So if the manufacturer is “Mazda”, the model is NA and the description contains “Mazda 6”, we set the model to “6”.
nrow(data[is.na(data$model),])
## [1] 3774
data[is.na(data$model),]$model <- apply(data[is.na(data$model),], MARGIN=1, FUN=function(row) {
model <- str_extract(row['description'], regex(sprintf("(?<=%s\\s)\\w+", row['manufacturer']),ignore_case=TRUE))
return(model)
})
nrow(data[is.na(data$model),])
## [1] 1654
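The lookbehind idea can be tried in isolation with base R, as in this minimal sketch. The helper `model_after_make` is hypothetical and handles one description at a time; the example strings are made up.

```r
# Hypothetical helper: first word that follows the manufacturer name.
model_after_make <- function(description, make) {
  pattern <- sprintf("(?<=%s\\s)\\w+", make)    # (?<=...) is a positive lookbehind
  m <- regexpr(pattern, description, ignore.case = TRUE, perl = TRUE)
  if (is.na(m) || m < 0) return(NA_character_)  # make not found in the text
  regmatches(description, m)
}

model_after_make("Selling my Mazda 6, clean title", "mazda")
model_after_make("Great truck everything works", "ram")
```

Note that the pattern has no word boundary before the make, so a short make such as "ram" would also match inside a longer word like "program"; adding `\\b` in front of `%s` is one way to avoid that.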
data[is.na(data$model),][1:5,]
## id
## 206 7239785162
## 646 7236608785
## 1502 7230263884
## 1866 7227495669
## 2308 7234774355
## url
## 206 https://bham.craigslist.org/cto/d/gainesville-2017-ram-4x4-for-sale/7239785162.html
## 646 https://bham.craigslist.org/cto/d/trussville-1955-chevy-belair-hardtop/7236608785.html
## 1502 https://bham.craigslist.org/cto/d/springville-1937-ford-coupe-streetrod/7230263884.html
## 1866 https://bham.craigslist.org/cto/d/trussville-1956-chevy-210-trade/7227495669.html
## 2308 https://dothan.craigslist.org/cto/d/dothan-2005-gmc-ukon-denali-all-wheel/7234774355.html
## region region_url price year manufacturer model
## 206 birmingham https://bham.craigslist.org 18000 2017 ram <NA>
## 646 birmingham https://bham.craigslist.org 26500 1955 chevrolet <NA>
## 1502 birmingham https://bham.craigslist.org 40900 1937 ford <NA>
## 1866 birmingham https://bham.craigslist.org 18000 1956 chevrolet <NA>
## 2308 dothan https://dothan.craigslist.org 4500 2005 gmc <NA>
## condition cylinders fuel odometer title_status transmission VIN drive
## 206 <NA> <NA> gas 95000 clean automatic <NA> <NA>
## 646 <NA> <NA> gas NA clean automatic <NA> <NA>
## 1502 excellent 8 cylinders gas 13000 clean automatic <NA> rwd
## 1866 <NA> <NA> gas NA clean automatic <NA> <NA>
## 2308 fair <NA> gas 158000 clean automatic <NA> <NA>
## size type paint_color
## 206 <NA> <NA> <NA>
## 646 <NA> <NA> <NA>
## 1502 full-size coupe blue
## 1866 <NA> <NA> <NA>
## 2308 <NA> <NA> red
## image_url
## 206 https://images.craigslist.org/00n0n_cZ7RUc9IPIl_0t20CI_600x450.jpg
## 646 https://images.craigslist.org/00t0t_5M1MnIqbIVS_0CI0t2_600x450.jpg
## 1502 https://images.craigslist.org/00202_9Zl2QmcObLX_0CI0t2_600x450.jpg
## 1866 https://images.craigslist.org/00A0A_2DwIHe0Xiuv_0CI0t2_600x450.jpg
## 2308 https://images.craigslist.org/00q0q_iGsYoGLAtG9_0lM0t2_600x450.jpg
## description
## 206 Great truck everything works ding on back finder 4x4 works great hwy miles $18,000 cash 💸 cell # show contact info
## 646 For Sale or Trade ??(BUT TRADE IS HIGHER )I have a 1955 Belair Hardtop,The 55 still needs a little work but the hard stuff is done my price will go up the more work I do still needs door panels and head liner Might have them made in the next few days, Had a 4 speed in it the pedal still in it and the 55 has power disk brakes has a mild built 350/350 transmission, Paint is new and the bumpers front and rear are new, Still needs a few pieces of chrome but could have them in a couple of days,The Original Seats have been recoverd and look good new Rims and Tires . Thanks Doug I Like OLD 30 sand 40s Coupes and Muscle cars
## 1502 Minotti glass, 350 Corvette engine,700R, Mustang II, 8 inch Ford, Coddington wheels,Walker, Lokar, Dakota Digital,Vintage Air, P/B, P/S, cruise, tilt, Carrera, Pdrs, P/T, grey leather, Sony Sound, Boston Acoustics, ghost flames, loaded. NSRA Safety 23 inspected. Call show contact info . No Texts Please.
## 1866 (TRADE IS HIGHER ) I have a 1956 210 with a 350/350 motor and transmission shifts good lights all work has good brakes , runs and drives good,the 56 has been under a carport for a few years just sitting but still runs and drives good. Car has a little rust in the pans but has been patch so you can drive it while you work on it, Im still working on it every day so price will change I'm asking 18k look around and see what a Tri 5s is going for, I like to trade but be real with your offers. AND NOTHING OVER 1972 I LIKE OLD 30s COUPES AND OLD HOT RODS ,CAR SOLD WITH BILL OF SALE THANKS DOUG. 205-five 08- six112
## 2308 2005 UKON ALL WHEEL DR. 257000 MI. GOOD. MECHANICAL COND. FOR MILAGE! USE NO OIL, NO SMOKE, FAIR BFG AT. TIRES
## state lat long posting_date
## 206 al 32.82100 -88.15890 2020-12-01T07:50:35-0600
## 646 al 33.63390 -86.59810 2020-11-24T18:54:33-0600
## 1502 al 33.76742 -86.46558 2020-11-13T10:23:53-0600
## 1866 al 33.63390 -86.59810 2020-11-08T11:39:44-0600
## 2308 al 31.14810 -85.37180 2020-11-21T11:51:54-0600
We consider the replacement a success: it filled in more than half of the missing values (from 3,774 down to 1,654). The approach is not perfect, because the model name does not always follow the manufacturer name. In theory, this could be addressed by building a frequency table of models, choosing a threshold, and setting the models we regard as outliers back to NA. For example, if a model occurs only once or twice across all listings, it is very likely an error.
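The frequency threshold described above can be sketched on a made-up vector. The threshold of two occurrences follows the example in the text; the variable names are ours.

```r
# Toy model names (made up); "is" and "for" mimic wrongly extracted words.
models <- c("impreza", "impreza", "impreza", "rogue", "rogue", "rogue", "is", "for")

max_rare_count <- 2                                # assumed threshold
counts <- table(models)                            # frequency of each model
rare <- names(counts)[counts <= max_rare_count]    # models seen at most twice

models_clean <- models
models_clean[models_clean %in% rare] <- NA         # set suspicious models back to NA
table(models_clean, useNA = "ifany")
```

The drawback is that genuinely rare models are discarded as well, so the threshold would have to be chosen with the size of the dataset in mind.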
typeof(data$condition)
## [1] "character"
nrow(data[is.na(data$condition),])
## [1] 132279
unique(data$condition)
## [1] "good" "excellent" NA "like new" "fair" "salvage"
## [7] "new"
Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.
data <- data[!is.na(data$condition),]
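One-hot encoding, as planned for the prediction phase, could be done with base R's `model.matrix()`. This is a sketch on a made-up data frame, not a step of the analysis above.

```r
# Toy data (made up for illustration)
cars <- data.frame(
  price     = c(5000, 12000, 800),
  condition = factor(c("good", "excellent", "salvage"))
)

# "- 1" drops the intercept so that every level gets its own 0/1 column.
one_hot <- model.matrix(~ condition - 1, data = cars)
colnames(one_hot)
```

Each row of `one_hot` contains exactly one 1, in the column of that row's condition.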
typeof(data$cylinders)
## [1] "character"
nrow(data[is.na(data$cylinders),])
## [1] 44763
unique(data$cylinders)
## [1] "8 cylinders" "4 cylinders" "6 cylinders" NA "10 cylinders"
## [6] "5 cylinders" "3 cylinders" "other" "12 cylinders"
To be able to work with this attribute later, we convert it to a numeric attribute and mark the unknown value other as NA.
data[which(data$cylinders == 'other'),]$cylinders <- NA
data[!is.na(data$cylinders),]$cylinders <- apply(data[!is.na(data$cylinders),], MARGIN=1, FUN=function(x) str_extract(x['cylinders'],'\\d+'))
Finally, we convert it to integer:
data$cylinders <- as.integer(data$cylinders)
unique(data$cylinders)
## [1] 8 4 6 NA 10 5 3 12
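The same conversion can also be written without a row-wise `apply()`. The sketch below is an alternative formulation on a made-up vector, shown only for comparison.

```r
cyl <- c("8 cylinders", "4 cylinders", "other", NA, "10 cylinders")  # made-up sample

# Strip the suffix and convert; "other" cannot be parsed and becomes NA
# (the resulting coercion warning is suppressed on purpose).
cyl_int <- suppressWarnings(as.integer(sub(" cylinders$", "", cyl)))
cyl_int
```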
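We drop rows with missing values. Note that the filter below tests condition, which was already cleaned above, so rows with a missing cylinders value remain in the data.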
data <- data[!is.na(data$condition),]
typeof(data$fuel)
## [1] "character"
nrow(data[is.na(data$fuel),])
## [1] 9
unique(data$fuel)
## [1] "gas" "diesel" "other" "hybrid" "electric" NA
Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.
data <- data[!is.na(data$fuel),]
This attribute contains outlying values of the distance driven (in miles).
typeof(data$odometer)
## [1] "double"
summary(data$odometer)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0 39600 89000 108566 138124 2043755555 16227
boxplot(data$odometer, las=3)
ggplot(data = data, aes(sample=odometer)) +
stat_qq() +
stat_qq_line() +
scale_y_continuous(breaks = seq(0, 1000000, by = 250000))
## Warning: Removed 16227 rows containing non-finite values (stat_qq).
## Warning: Removed 16227 rows containing non-finite values (stat_qq_line).
We decided to drop the records above a chosen threshold.
top_threshold <- 1000000
# Keep only rows with a known odometer value at or below the threshold.
data <- data[!is.na(data$odometer) & data$odometer <= top_threshold, ]
data$odometer = percentil(data$odometer)
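The `percentil` helper is not shown in this section, so the sketch below is not that helper. It only illustrates one common quantile-based way of limiting outliers (Tukey's upper fence), on made-up odometer readings; the variable names are assumptions.

```r
# Made-up odometer readings with one implausible value.
odometer <- c(12000, 39600, 89000, 138124, 250000, 2043755555)

# Tukey's rule: values above Q3 + 1.5 * IQR are treated as outliers.
q <- quantile(odometer, probs = c(0.25, 0.75), na.rm = TRUE)
upper_fence <- q[["75%"]] + 1.5 * (q[["75%"]] - q[["25%"]])

# Cap instead of dropping: outliers are replaced by the fence value.
odometer_capped <- pmin(odometer, upper_fence)

odometer_capped
```

Capping keeps the record count unchanged, whereas the fixed threshold above removes records; which is preferable depends on how the attribute is used later.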
boxplot(data$odometer, las=2)
ggplot(data = data, aes(sample=odometer)) +
stat_qq() +
stat_qq_line() +
scale_y_continuous(breaks = seq(0, 5000000, by = 50000))
typeof(data$transmission)
## [1] "character"
nrow(data[is.na(data$transmission),])
## [1] 52
unique(data$transmission)
## [1] "other" "automatic" "manual" NA
Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.
data <- data[!is.na(data$transmission),]
We will not remove the VIN. For this attribute we want to find out whether a listing has a higher price when the VIN is present.
We do, however, add an attribute that we will need later: a boolean value indicating whether the VIN is given.
# TRUE where the listing states a VIN, FALSE where it is missing.
data$VIN_defined <- !is.na(data$VIN)
typeof(data$drive)
## [1] "character"
nrow(data[is.na(data$drive),])
## [1] 37021
unique(data$drive)
## [1] "rwd" "fwd" NA "4wd"
Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.
data <- data[!is.na(data$drive),]
typeof(data$size)
## [1] "character"
nrow(data[is.na(data$size),])
## [1] 84329
unique(data$size)
## [1] NA "full-size" "mid-size" "compact" "sub-compact"
Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.
data <- data[!is.na(data$size),]
typeof(data$type)
## [1] "character"
nrow(data[is.na(data$type),])
## [1] 1912
unique(data$type)
## [1] "pickup" "SUV" "sedan" "truck" "van"
## [6] "convertible" "hatchback" "coupe" "mini-van" NA
## [11] "wagon" "other" "offroad" "bus"
Apart from dropping the rows where this attribute is missing, we do not plan to do anything with it for now. In the prediction phase we would one-hot encode it.
data <- data[!is.na(data$type),]
typeof(data$long)
## [1] "double"
summary(data$long)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## -160.88 -104.70 -86.79 -92.28 -79.93 150.90 570
data <- data[!is.na(data$long),]
Data preparation for the 4th hypothesis: based on longitude, we split the listings into the east, middle, and west of the USA. We determined the longitude cut-offs approximately from this map:
[Figure: map of the USA]
data$location <- ifelse(as.double(data$long) > -85, 'east',
ifelse(as.double(data$long) < -110, 'west', 'mid')
)
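To make the boundary behaviour of this rule explicit, the same cut-offs can be wrapped in a small function and tried on a few made-up longitudes. The function name `region_from_longitude` is hypothetical and does not appear in the report.

```r
# Hypothetical helper applying the same rule as above:
# longitude > -85 is 'east', longitude < -110 is 'west', everything else is 'mid'.
region_from_longitude <- function(long) {
  ifelse(long > -85, "east",
         ifelse(long < -110, "west", "mid"))
}

# Made-up longitudes, including both boundary values and a missing value.
long <- c(-122.4, -110, -104.7, -85, -79.9, NA)

# The boundary values -110 and -85 both fall into 'mid'; NA stays NA.
regions <- region_from_longitude(long)
regions
```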
As mentioned earlier, it is not feasible to impute the existing data, because the attributes are too interdependent for individual records to be filled in. Imputation could produce attribute combinations that do not exist in reality. It is therefore more useful for us to remove these records than to knowingly introduce a high bias into the data. These attributes are important for our hypotheses about vehicle price, and they could not be replaced by any method (mean, sampling that preserves the distribution, and so on) without harming the authenticity of the data, so we decided to remove them.
We resolved most of the problems we had by deleting the affected records or by adjusting them using quartiles.
kruskal.test(x = data$price, g = as.factor(data$VIN_defined))
##
## Kruskal-Wallis rank sum test
##
## data: data$price and as.factor(data$VIN_defined)
## Kruskal-Wallis chi-squared = 6230.4, df = 1, p-value < 2.2e-16
ggplot(data, aes(x=VIN_defined,y=price)) + geom_boxplot()
The boxplot shows that the typical (median) price is indeed higher when the listing states the VIN. The Kruskal-Wallis test likewise gives a p-value below 0.05, so we reject the null hypothesis (that the price distribution is the same whether or not the VIN is given) and conclude that stating the VIN in the listing is related to the vehicle's price.
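Since the Kruskal-Wallis test alone does not say which group is higher, it can help to report the group medians next to it. The following is a minimal sketch on simulated prices; the data frame `toy` and the size of the group difference are made up.

```r
set.seed(1)

# Simulated prices: both the values and the group effect are made up.
toy <- data.frame(
  VIN_defined = rep(c(FALSE, TRUE), each = 200),
  price = c(rlnorm(200, meanlog = 9.0, sdlog = 0.6),
            rlnorm(200, meanlog = 9.4, sdlog = 0.6))
)

# Median price per group: this is the line drawn inside each box of a boxplot.
medians <- aggregate(price ~ VIN_defined, data = toy, FUN = median)

# Rank-based test of whether the two price distributions differ.
kw <- kruskal.test(x = toy$price, g = as.factor(toy$VIN_defined))

medians
kw$p.value
```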
chisq.test(data$year, data$cylinders, correct=FALSE)
## Warning in chisq.test(data$year, data$cylinders, correct = FALSE): Chi-squared
## approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: data$year and data$cylinders
## X-squared = 7307.4, df = 558, p-value < 2.2e-16
ggplot(data = data, aes(x = year, y = cylinders)) +
stat_summary(geom = "line", fun = mean)
## Warning: Removed 1903 rows containing non-finite values (stat_summary).
To compare two categorical (albeit numeric) attributes, year and number of cylinders, we chose the chi-squared test. The p-value is very small (< 2.2e-16), so we again reject the null hypothesis. The plot also shows that the mean number of cylinders decreases with increasing year (even though the number of vehicles grows with the year).
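The warning that the chi-squared approximation may be incorrect typically appears when many cells of the contingency table have small expected counts. One possible way to check the result is a Monte Carlo p-value, which `chisq.test` supports through `simulate.p.value`. The sketch below uses made-up data, not the dataset from the report.

```r
set.seed(42)

# Made-up sparse data: many year levels and a rare cylinder count.
year <- sample(2000:2020, size = 300, replace = TRUE)
cylinders <- sample(c(4, 6, 8, 12), size = 300, replace = TRUE,
                    prob = c(0.45, 0.35, 0.18, 0.02))

# Monte Carlo p-value based on B simulated tables with the same margins;
# it does not rely on the large-sample chi-squared approximation.
res <- chisq.test(year, cylinders, simulate.p.value = TRUE, B = 2000)

res$p.value
```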
# TRUE for listings that have four-wheel drive and run on diesel.
data$awd_diesel <- data$drive == '4wd' & data$fuel == 'diesel'
kruskal.test(x = data$odometer, g = as.factor(data$awd_diesel))
##
## Kruskal-Wallis rank sum test
##
## data: data$odometer and as.factor(data$awd_diesel)
## Kruskal-Wallis chi-squared = 209.39, df = 1, p-value < 2.2e-16
ggplot(data, aes(x=awd_diesel,y=odometer)) + geom_boxplot()
Both the boxplot and the p-value of the Kruskal-Wallis test indicate that cars with four-wheel drive (4wd) and diesel fuel tend to have higher mileage.
ggplot(data, aes(x = type, group=location, fill=location)) +
geom_bar(position = "dodge" )
The chart shows that sedans are the most numerous type in the eastern USA. In the middle of the USA, SUVs are the most common.
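Because the regions contain different numbers of listings, raw counts can be complemented by the share of each vehicle type within a region. A minimal sketch with `prop.table`, using a made-up data frame `toy`:

```r
# Made-up listings: vehicle type and region.
toy <- data.frame(
  type = c("sedan", "sedan", "SUV", "pickup", "SUV",
           "SUV", "sedan", "pickup", "sedan", "SUV"),
  location = c("east", "east", "east", "east", "mid",
               "mid", "mid", "mid", "west", "west")
)

# Counts of each type per region (rows are regions, columns are types).
counts <- table(toy$location, toy$type)

# margin = 1 normalises each row, so every region sums to 1.
shares <- prop.table(counts, margin = 1)

round(shares, 2)
```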
Is the event that a listed car has an automatic transmission independent of whether the listed car is a pickup?
[Figure: Bayesian network]
Probability that a car has an automatic transmission:
print(nrow(data[data$transmission == "automatic", ])/nrow(data))
## [1] 0.9235323
PA = nrow(data[data$transmission == "automatic", ])/nrow(data)
Probability that a car is a pickup:
print(nrow(data[data$type == "pickup", ])/nrow(data))
## [1] 0.07651808
PP = nrow(data[data$type == "pickup", ])/nrow(data)
State probabilities \( P(M=1 \mid P, A) \)
MID = 1, PICKUP = 1, AUTOMATIC = 1
data_pick = data[data$type == "pickup", ]
data_pick_automat = data_pick[data_pick$transmission == "automatic", ]
data_pick_automat_mid =data_pick_automat[data_pick_automat$location == "mid", ]
Pm1p1a1 = nrow(data_pick_automat_mid) / nrow(data_pick_automat)
print(nrow(data_pick_automat_mid) / nrow(data_pick_automat))
## [1] 0.3754455
MID = 1, PICKUP = 1, AUTOMATIC = 0
data_pick = data[data$type == "pickup", ]
data_pick_automat = data_pick[!data_pick$transmission == "automatic", ]
data_pick_automat_mid =data_pick_automat[data_pick_automat$location == "mid", ]
Pm1p1a0 = nrow(data_pick_automat_mid) / nrow(data_pick_automat)
print(nrow(data_pick_automat_mid) / nrow(data_pick_automat))
## [1] 0.2807775
MID = 1, PICKUP = 0, AUTOMATIC = 1
data_pick = data[!data$type == "pickup", ]
data_pick_automat = data_pick[data_pick$transmission == "automatic", ]
data_pick_automat_mid =data_pick_automat[data_pick_automat$location == "mid", ]
Pm1p0a1 = nrow(data_pick_automat_mid) / nrow(data_pick_automat)
print(nrow(data_pick_automat_mid) / nrow(data_pick_automat))
## [1] 0.3558854
MID = 1, PICKUP = 0, AUTOMATIC = 0
data_pick = data[!data$type == "pickup", ]
data_pick_automat = data_pick[!data_pick$transmission == "automatic", ]
data_pick_automat_mid =data_pick_automat[data_pick_automat$location == "mid", ]
Pm1p0a0 = nrow(data_pick_automat_mid) / nrow(data_pick_automat)
print(nrow(data_pick_automat_mid) / nrow(data_pick_automat))
## [1] 0.303495
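We want to verify the independence of P and A. If the events are to be independent given M, the relation \( P(P, A \mid M) = P(A \mid M)\,P(P \mid M) \) must hold; the left-hand side cannot be read off the data directly, so we derive it below.
We first compute the probability of M = 1 by summing over all states:
\( P(M=1) = \sum_{P,A} P(M=1 \mid P, A)\,P(P)\,P(A) \)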
Pm1ap = Pm1p0a0*(1-PP)*(1-PA) + Pm1p0a1*(1-PP)*(PA) + Pm1p1a0*(PP)*(1-PA) + Pm1p1a1*(PP)*(PA)
Pm1ap
## [1] 0.3531285
We compute the probability of each state given that the listing comes from the middle of the USA.
\( P(P, A \mid M=1) = \frac{P(M=1 \mid P, A)\,P(P)\,P(A)}{P(M=1)} \)
\( P(P=1, A=1 \mid M=1) \)
Pp1a1m1 = Pm1p1a1*(PP)*(PA) / Pm1ap
Pp1a1m1
## [1] 0.07513291
\( P(P=0, A=1 \mid M=1) \)
Pp0a1m1 = Pm1p0a1*(1-PP)*(PA) / Pm1ap
Pp0a1m1
## [1] 0.8595236
\( P(P=1, A=0 \mid M=1) \)
Pp1a0m1 = Pm1p1a0*(PP)*(1-PA) / Pm1ap
Pp1a0m1
## [1] 0.004652342
\( P(P=0, A=0 \mid M=1) \)
Pp0a0m1 = Pm1p0a0*(1-PP)*(1-PA) / Pm1ap
Pp0a0m1
## [1] 0.06069112
Now we look at the second part of the relation, \( P(A \mid M)\,P(P \mid M) \).
1. \( P(P=1 \mid M=1) = \sum_{A} P(P=1, A \mid M=1) \)
Pp1m1 = Pp1a1m1 + Pp1a0m1
Pp1m1
## [1] 0.07978525
2. \( P(A=1 \mid M=1) = \sum_{P} P(P, A=1 \mid M=1) \)
Pa1m1 = Pp1a1m1 + Pp0a1m1
Pa1m1
## [1] 0.9346565
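If the events P and A were independent given M = 1, the joint probability would have to equal the product of the marginals for every state, for example: \( P(P=1, A=1 \mid M=1) = P(P=1 \mid M=1)\,P(A=1 \mid M=1) \)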
# Compare with a numerical tolerance; exact == on doubles is fragile.
if (isTRUE(all.equal(Pp1a1m1, Pp1m1 * Pa1m1)))
{
  print('Independent')
} else {
  print('Dependent')
}
## [1] "Dependent"
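The whole calculation can also be written compactly with 2 x 2 tables, which makes it easy to compare every state at once rather than only P = 1, A = 1. The sketch below reuses the rounded values printed above; the names `lik`, `prior`, `post` and `indep` are introduced here for illustration only.

```r
# Values copied from the printed output above (rounded as shown).
PA <- 0.9235323   # P(A = 1)
PP <- 0.07651808  # P(P = 1)

# P(M = 1 | P, A); rows are P = 0/1, columns are A = 0/1 (filled column by column).
lik <- matrix(c(0.303495, 0.2807775,    # A = 0: P = 0, P = 1
                0.3558854, 0.3754455),  # A = 1: P = 0, P = 1
              nrow = 2,
              dimnames = list(P = c("0", "1"), A = c("0", "1")))

# P(P) * P(A), as used in the calculation above.
prior <- outer(c(1 - PP, PP), c(1 - PA, PA))

# Bayes' rule: joint posterior P(P, A | M = 1).
joint <- lik * prior
post <- joint / sum(joint)

# Product of the posterior marginals P(P | M = 1) * P(A | M = 1).
indep <- outer(rowSums(post), colSums(post))

# Non-zero differences mean the joint posterior does not factorise.
round(post - indep, 5)
```

With estimated probabilities, small non-zero differences can also arise by chance, so how large a difference should count as dependence is a separate question that this comparison does not answer.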
[Figure: Bayesian network]